logging in or signing up ecai2 Jancis Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 73 Category: Entertainment License: All Rights Reserved Like it (0) Dislike it (0) Added: October 12, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript PROGOL: PROGOL the newest ILP system from S. Muggleton based on the idea of inverse entailment, and on info. compression examples and background knowledge can be defined intensionally, as Prolog clauses, not just ground terms can learn with positive examples only efficiency comparable to FOILThe compression idea: The compression idea a general idea in ML, not just ILP observes that any learned hypothesis can be used to compress the data: any data set can be represented as its part covered by the hypo., and the rest observes that from Bayes’s theory, P(h | D) = max log P(D | h) + log P(h) where P(h | D) is the most probable hypo. given the data, logP(D | h) is the length of the coding of the data not covered by h, and logP(h) is the length of the coding of h we want the best compression. Why?The Progol algorithm: The Progol algorithm 1. for each positive example in turn, construct the most specific clause that implies that example (inverse entailment) 2. find a clause that is more general (in the q-subsumption sense) than the clause found in 1., and that gives the best compresion 3. remove all the clauses (pos. examples) covered by the res. of 3 4. iterate 1-3 until all pos. examples have been exhaustedInverse entailment: Inverse entailment a model-theoretic, rather than proof-theory based approach the hypothesis language is constrained wrt to types and modes (input/output arguments). the most specific clause which entails the examples is derivedExample: multiply. Mode decls: Example: multiply. Mode decls :- modeb(1,dec(+int,-int))? :- modeb(1,plus(+int,+int,-int))? :- modeb(1,inc(+int,-int))? :- modeb(1,mult(+int,+int,-int))? :- modeb(1,plus(+int,+int,-int))? :- modeh(1,plus(+int,+int,-int))? :- modeh(1,mult(+int,+int,-int))? 1 call input args output argMultiply: pos and neg examples: Multiply: pos and neg examples :- commutative(mult/3), commutative(plus/3)? :- set(c,3)? %%%%%%%%%%%%%%%%%%%%%%%%% % Positive examples mult(4,X,Y) :- plus(X,X,Z), plus(X,Z,Z1), plus(X,Z1,Y). mult(3,X,Y) :- plus(X,X,Z), plus(X,Z,Y). mult(2,X,Y) :- plus(X,X,Y). mult(1,X,X). mult(0,X,0). max clause lengthProgol examples can be intensional: Progol examples can be intensional plus(4,X,Y) :- inc(X,U), inc(U,V), inc(V,W), inc(W,Y). plus(3,X,Y) :- inc(X,U), inc(U,V), inc(V,Y). plus(2,X,Y) :- inc(X,Z), inc(Z,Y). plus(1,X,Y) :- inc(X,Y). plus(0,X,X). plus(X,0,X). %%%%%%%%%%%%%%%%%%%%%%%%% % Negative examples :- mult(0,0,1). :- mult(2,3,12). :- mult(2,4,10). :- mult(2,5,4). :-...Progol data for learning to recognize animals: Progol data for learning to recognize animals % Mode declarations :- modeh(1,class(+animal,#class))? :- modeb(1,has_milk(+animal))? :- modeb(1,has_gills(+animal))? :- modeb(1,has_covering(+animal,#covering))? :- modeb(1,has_legs(+animal,#nat))? :- modeb(1,homeothermic(+animal))? :- modeb(1,has_eggs(+animal))? % :- modeb(1,not has_milk(+animal))? % :- modeb(1,not has_gills(+animal))? :- modeb(*,habitat(+animal,#habitat))? :- modeh(1,false)? :- modeb(1,class(+animal,#class))? const. of type classSlide9: % Types animal(dog). animal(dolphin). animal(platypus).animal(bat). animal(trout). animal(herring). animal(shark). animal(eel). animal(lizard).animal(crocodile).animal(t_rex).animal(turtle). animal(snake). animal(eagle). animal(ostrich). animal(penguin). class(mammal). class(fish). class(reptile). class(bird). covering(hair). covering(none). covering(scales). covering(feathers). habitat(land). habitat(water). habitat(air). Slide10: % Positive examples class(dog,mammal). class(trout,fish). class(lizard,reptile). class(crocodile,reptile). class(eagle,bird). % Negative examples % :- class(X,mammal), class(X,fish). % :- ... :- class(trout,mammal).The most specific entailment: The most specific entailment [Generalising :- class(trout,mammal).] [Most specific clause is] :- class(A,mammal), class(A,fish), has_gills(A), has_covering(A, scales), has_legs(A,0), has_eggs(A), habitat(A,water). [C:40,42,1,0 :- class(A,mammal).] [C:5,7,0,0 :- class(A,mammal), class(A,fish).] [C:39,42,1,0 :- class(A,mammal), has_legs(A,0).] [C:39,42,1,0 :- class(A,mammal), has_eggs(A).] [C:39,42,1,0 :- class(A,mammal), habitat(A,water).] [C:38,42,1,0 :- class(A,mammal), has_eggs(A), habitat(A,water).] [C:38,42,1,0 :- class(A,mammal), has_legs(A,0), habitat(A,water).] [7 explored search nodes] [f=5,p=7,n=0,h=0] [Result of search is] :- class(A,mammal), class(A,fish).Slide12: p=# of pos. lit cov. n=# of neg lit cov h=est. of lits needed f=p-n-#lit in body-hMost specific entailment: multiplication: Most specific entailment: multiplication [Generalising mult(4,X,Y) :- plus(X,X,Z), plus(X,Z,Z1), plus(X,Z1,Y).] [Most specific clause is] mult(A,B,C) :- dec(A,D), plus(B,B,E), dec(D,F), plus(B,E,G), mult(D,B,G), dec(F,H), plus(B,G,C), plus(F,F,A), mult(F, B,E), mult(F,F,A). [C:-19,5,21,3 mult(A,B,C).] [C:-19,4,20,2 mult(A,B,C) :- dec(A,D).] [C:-13,4,14,1 mult(A,B,C) :- dec(A,D), mult(D,B,E).] [C:1,4,0,0 mult(A,B,C) :- dec(A,D), mult(D,B,E), plus(B,E,C).] [4 explored search nodes] [f=1,p=4,n=0,h=0] [Result of search is] mult(A,B,C) :- dec(A,D), mult(D,B,E), plus(B,E,C). [4 redundant clauses retracted] Compressing of the generalization: Compressing of the generalization [Generalising mult(0,X,0).] [Most specific clause is] mult(A,B,A) :- plus(A,A,A), plus(A,B,B), mult(A,A,A). [C:-1,1,2,0 mult(A,B,A).] [C:-19,2,21,0 mult(A,B,C).] [2 explored search nodes] [f=-1,p=1,n=2,h=0] [No compression] mult(0,X,0). mult(A,B,C) :- dec(A,D), mult(D,B,E), plus(B,E,C). [Total number of clauses = 2] APPLICATIONS of ILPA brief review: APPLICATIONS of ILP A brief review Natural Language Processing Computer Aided Design Drug design Musicology BiochemistryNatural Language Processing: Natural Language Processing Zelle, J.M. and Mooney, R.J., Learning Semantic Grammars with Constructive Inductive Logic Programming, Procs. of the 11th AAAI, pp. 817-822, Washington, D.C., 1993 the task: assignment of a semantic category (actor, instrument, …) in a parser of English, e.g. the girl hit the window hit agt: girl the hammer hit the window hit inst: hammer the girl ate the pasta with the cheese pat(obj):[pasta accomp: cheese] the girl at the pasta with the fork pat(verb): pasta inst: fork the idea is to learn the parser from examplesSlide17: the parser can be represented easily as a logic program learning requires a relational representation, as there are relations between parts of speech that are to be reflected in the parser initially, overly general rules are learned: op([Top, Second|RestStack], Input, [NewTop|RestStack], Input) :- combine(Top, agt, Second, NewTop) ILP is applied to specialize overly general rules op([A, [B|det:the]],C,[D],C) :- animate(B), attach(A, agt, B, D). animate(girl). animate(boy). animate(cat).Slide18: as the new predicate animate is invented, constructive ILP techniques are used. predicate invention extends the language: the examples never included the concept of animate another invented predicate was useful to make parsing distinctions: instr_phrase([with,the,X]) :- instrument(X). instrument(fork). instrument(hammer).… the resulting parser achieved 92% accuracy with just 150 training examples (sentences) CAD: Finite Element mesh design(B. Dolsak, S. Muggleton, Application of ILP to Finite Element Mesh Design, in ILP, S. Muggleton, ed., pp. 453-472): CAD: Finite Element mesh design (B. Dolsak, S. Muggleton, Application of ILP to Finite Element Mesh Design, in ILP, S. Muggleton, ed., pp. 453-472) stress analysis in structures structure = finite collection of elements (meshes) Slide20: the important question is the one of resolution: how many elements there should be on any given edge for the mesh to correctly propagate the stress? considerable number of examples is published general rules can be learned from these examples essentially, a relational problem! training instances: +mesh(b1,6) -mesh(b1,1) +mesh(b2,1) -mesh(b2,2) +mesh(b3,6) -mesh(b3,2) +… -…Slide21: background knowledge: types of edges (important_long, _short, not_important, circuit, half_circuit,…) boundary conditions(free, one_side_fixed,…), loadings (not_loaded,one_side_loaded,…fully_loaded) geometry: neighbor, opposite, equal neighbor(b1,b2) opposite(b1,b3) eq(b1,b3) neighbor(b2,b3) opposite(b10,b33) eq(b3,b6)Slide22: examples of results: mesh(A,B) ¬ not_important(A), B=1. mesh(A,B) ¬ quarter_circuit(A), B=9. mesh(A,B) ¬ opposite(A,C),not_important(C), short(A),B=2 . when using FOIL, problems with neighbor: lack of discriminatory power – clichés!Predicting Structure/Activity Relationships(drug design): Predicting Structure/Activity Relationships (drug design) predicting activity from molecular properties (hydrophobicity, sigma effect, molecular reflectivity, LUMO) limited in the representation of molecular connectivity determinate constraints require that ea. atom is bonded to at most one other atommutagenicity problem: mutagenicity problem there are some statistical results, using regression, as baseline 230 compounds were classified for high/low mutagenicity 188 ‘regression-friendly’ compunds were classified by PROGOL with accuracy matching both regression-bnased and ANN classifier rules are simple and comprehensible, e.g. LUMO value <= -1.570 and carbon atom merging six-membered aromatic ringsSlide25: remaining 42 compounds: PROGOL requires presence of 5-member aromatic carbon ring with a nitrogen atom linked by a single bond followed by a double bond This is a true scientific discovery to appear in NATUREMusical application (M. Dovey, Oxford): Musical application (M. Dovey, Oxford) goal: to learn the style of a pianist (Rachmaniniov) from a recording piano dataset (dynamics) knowledge representation: relational, e.g. we want ot say that the current note is higher than the previous one structural and performance data combined Slide29: Results say something about Rachmaninov’s playing style: e.g. rubato_half(A) <- pre(B, A),, touch_staccato(A), rythm-quaver(A), rubato_detach_half(B) note is played half its length if the note is marked staccato and the gap after the note is lengthened up to four notes are connected learned rules are typically 70-90% correct Conclusion: Conclusion many ILP systems are capable of constructive learning there are good theoretical foundations allowing firm statements on learnability, generality, etc. there is a number of exciting applications more and more robust implementations are developed, enabling scaling-up of problemsProtein secondary structure prediction using GOLEMMuggleton, King, Sternberg, Protein Engineering 1992: Protein secondary structure prediction using GOLEM Muggleton, King, Sternberg, Protein Engineering 1992 Slide32: Protein: 1CRN (Crambin): ttccpsivarsfnvcrlpgtpeaicatgtgciipgatcpgdyan aminoacid sequence ------HHHHHHHHHHH-----HHHHHHHH-------------- secondary structure -----HHHH-----HHH----HHHHHHHHHH------------- GOLEM ------------------------HHH----------------- “GOR” (standard method in biochem.) H = alpha-helix - = coilSlide33: T = 46 Wgolem = 14 WGOR = 3 Xgolem = 23 XGOR = 27 Qgolem = 100*(37/46) = 80.4 XGOR = 100*(30/46)n= 65.2 Slide34: Representation - “skeleton” predicates alpha(A, B) residue in pos. B in protein A is an alpha-helix, eg. alpha(1CRN, 7) octf(D, E, F, G, B, H, I, J, K) defines a “window” of positions around B position(A, B, N) residue at position B in protein A has primary structure N, eg. position(1CRN, 7, i)Representation - attributes (unary predicates): Representation - attributes (unary predicates) hydrpohobic(X) large(X), small(X), tiny(X) aromatic(X), non-aromatic(X) not_p(X) X is not proline not_k(X) X is not lysineLearning: Learning Some secondary structures are given to GOLEM, together with their primary structures classification rules (level 0) are learned, eg. alpha0(A, B) :- octf(D,E,F,G,B,H,I,J,K), pos(A,F,Q), not_p(Q), pos(A,G,N), not_aromatic(N), not_p(N), pos(A,B,C), large(C), not_aromatic(C), not_k(C), pos(A,H,L), …Slide37: Level 1 rules are obtained by adding level 0 rules to the background knowledge, and GOLEM was re-run (level 0 rules predicted only very short sequences). Level 2 rules were obtained by adding level 1 rules to background knowledge. 21 level 0 rules, 5 level 1 rules and 2 level 2 rules were obtained, e.g. alpha1(A, B) :- octf(D,E,F,G,B,H,I,J,K), alpha0(A,H), alpha0(A,I) level 1 alpha2(A,B) :- octf(C,D,E,F,B,G,H,I,J), alpha1(A,G), alpha1(A,H), alpha1(A,I)Slide38: a residue is classified as an alpha-helix if the condition of any alpha0, alpha1, alpha2 rule is true, e.g. the sequence kpkppp and the rule alpha0(A,B) if octf(D,E,F,G,B,H,I,J,K), pos(A,B,N), small(N), pos(A,G,M), not_k(M) and the level 1 and level 2 rules above. We have kkkpppp HHH level 0 H level 1 _______________________ ---HHHHConclusion: Conclusion an emerging new field combining logic programming and machine learning goes beyond representational limitations of the standard attribute-value approaches bottom-up systems (e.g. GOLEM) use inverse resolution and/or RLGG. Inverse resolution has its limits. Inverse implication is more powerful, but also computationally expensive top-down systems (e.g. FOIL) specialize clauses that cover positive examples. They use heuristics to improve this search. You do not have the permission to view this presentation. In order to view it, please contact the author of the presentation.
ecai2 Jancis Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 73 Category: Entertainment License: All Rights Reserved Like it (0) Dislike it (0) Added: October 12, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript PROGOL: PROGOL the newest ILP system from S. Muggleton based on the idea of inverse entailment, and on info. compression examples and background knowledge can be defined intensionally, as Prolog clauses, not just ground terms can learn with positive examples only efficiency comparable to FOILThe compression idea: The compression idea a general idea in ML, not just ILP observes that any learned hypothesis can be used to compress the data: any data set can be represented as its part covered by the hypo., and the rest observes that from Bayes’s theory, P(h | D) = max log P(D | h) + log P(h) where P(h | D) is the most probable hypo. given the data, logP(D | h) is the length of the coding of the data not covered by h, and logP(h) is the length of the coding of h we want the best compression. Why?The Progol algorithm: The Progol algorithm 1. for each positive example in turn, construct the most specific clause that implies that example (inverse entailment) 2. find a clause that is more general (in the q-subsumption sense) than the clause found in 1., and that gives the best compresion 3. remove all the clauses (pos. examples) covered by the res. of 3 4. iterate 1-3 until all pos. examples have been exhaustedInverse entailment: Inverse entailment a model-theoretic, rather than proof-theory based approach the hypothesis language is constrained wrt to types and modes (input/output arguments). the most specific clause which entails the examples is derivedExample: multiply. Mode decls: Example: multiply. Mode decls :- modeb(1,dec(+int,-int))? :- modeb(1,plus(+int,+int,-int))? :- modeb(1,inc(+int,-int))? :- modeb(1,mult(+int,+int,-int))? :- modeb(1,plus(+int,+int,-int))? :- modeh(1,plus(+int,+int,-int))? :- modeh(1,mult(+int,+int,-int))? 1 call input args output argMultiply: pos and neg examples: Multiply: pos and neg examples :- commutative(mult/3), commutative(plus/3)? :- set(c,3)? %%%%%%%%%%%%%%%%%%%%%%%%% % Positive examples mult(4,X,Y) :- plus(X,X,Z), plus(X,Z,Z1), plus(X,Z1,Y). mult(3,X,Y) :- plus(X,X,Z), plus(X,Z,Y). mult(2,X,Y) :- plus(X,X,Y). mult(1,X,X). mult(0,X,0). max clause lengthProgol examples can be intensional: Progol examples can be intensional plus(4,X,Y) :- inc(X,U), inc(U,V), inc(V,W), inc(W,Y). plus(3,X,Y) :- inc(X,U), inc(U,V), inc(V,Y). plus(2,X,Y) :- inc(X,Z), inc(Z,Y). plus(1,X,Y) :- inc(X,Y). plus(0,X,X). plus(X,0,X). %%%%%%%%%%%%%%%%%%%%%%%%% % Negative examples :- mult(0,0,1). :- mult(2,3,12). :- mult(2,4,10). :- mult(2,5,4). :-...Progol data for learning to recognize animals: Progol data for learning to recognize animals % Mode declarations :- modeh(1,class(+animal,#class))? :- modeb(1,has_milk(+animal))? :- modeb(1,has_gills(+animal))? :- modeb(1,has_covering(+animal,#covering))? :- modeb(1,has_legs(+animal,#nat))? :- modeb(1,homeothermic(+animal))? :- modeb(1,has_eggs(+animal))? % :- modeb(1,not has_milk(+animal))? % :- modeb(1,not has_gills(+animal))? :- modeb(*,habitat(+animal,#habitat))? :- modeh(1,false)? :- modeb(1,class(+animal,#class))? const. of type classSlide9: % Types animal(dog). animal(dolphin). animal(platypus).animal(bat). animal(trout). animal(herring). animal(shark). animal(eel). animal(lizard).animal(crocodile).animal(t_rex).animal(turtle). animal(snake). animal(eagle). animal(ostrich). animal(penguin). class(mammal). class(fish). class(reptile). class(bird). covering(hair). covering(none). covering(scales). covering(feathers). habitat(land). habitat(water). habitat(air). Slide10: % Positive examples class(dog,mammal). class(trout,fish). class(lizard,reptile). class(crocodile,reptile). class(eagle,bird). % Negative examples % :- class(X,mammal), class(X,fish). % :- ... :- class(trout,mammal).The most specific entailment: The most specific entailment [Generalising :- class(trout,mammal).] [Most specific clause is] :- class(A,mammal), class(A,fish), has_gills(A), has_covering(A, scales), has_legs(A,0), has_eggs(A), habitat(A,water). [C:40,42,1,0 :- class(A,mammal).] [C:5,7,0,0 :- class(A,mammal), class(A,fish).] [C:39,42,1,0 :- class(A,mammal), has_legs(A,0).] [C:39,42,1,0 :- class(A,mammal), has_eggs(A).] [C:39,42,1,0 :- class(A,mammal), habitat(A,water).] [C:38,42,1,0 :- class(A,mammal), has_eggs(A), habitat(A,water).] [C:38,42,1,0 :- class(A,mammal), has_legs(A,0), habitat(A,water).] [7 explored search nodes] [f=5,p=7,n=0,h=0] [Result of search is] :- class(A,mammal), class(A,fish).Slide12: p=# of pos. lit cov. n=# of neg lit cov h=est. of lits needed f=p-n-#lit in body-hMost specific entailment: multiplication: Most specific entailment: multiplication [Generalising mult(4,X,Y) :- plus(X,X,Z), plus(X,Z,Z1), plus(X,Z1,Y).] [Most specific clause is] mult(A,B,C) :- dec(A,D), plus(B,B,E), dec(D,F), plus(B,E,G), mult(D,B,G), dec(F,H), plus(B,G,C), plus(F,F,A), mult(F, B,E), mult(F,F,A). [C:-19,5,21,3 mult(A,B,C).] [C:-19,4,20,2 mult(A,B,C) :- dec(A,D).] [C:-13,4,14,1 mult(A,B,C) :- dec(A,D), mult(D,B,E).] [C:1,4,0,0 mult(A,B,C) :- dec(A,D), mult(D,B,E), plus(B,E,C).] [4 explored search nodes] [f=1,p=4,n=0,h=0] [Result of search is] mult(A,B,C) :- dec(A,D), mult(D,B,E), plus(B,E,C). [4 redundant clauses retracted] Compressing of the generalization: Compressing of the generalization [Generalising mult(0,X,0).] [Most specific clause is] mult(A,B,A) :- plus(A,A,A), plus(A,B,B), mult(A,A,A). [C:-1,1,2,0 mult(A,B,A).] [C:-19,2,21,0 mult(A,B,C).] [2 explored search nodes] [f=-1,p=1,n=2,h=0] [No compression] mult(0,X,0). mult(A,B,C) :- dec(A,D), mult(D,B,E), plus(B,E,C). [Total number of clauses = 2] APPLICATIONS of ILPA brief review: APPLICATIONS of ILP A brief review Natural Language Processing Computer Aided Design Drug design Musicology BiochemistryNatural Language Processing: Natural Language Processing Zelle, J.M. and Mooney, R.J., Learning Semantic Grammars with Constructive Inductive Logic Programming, Procs. of the 11th AAAI, pp. 817-822, Washington, D.C., 1993 the task: assignment of a semantic category (actor, instrument, …) in a parser of English, e.g. the girl hit the window hit agt: girl the hammer hit the window hit inst: hammer the girl ate the pasta with the cheese pat(obj):[pasta accomp: cheese] the girl at the pasta with the fork pat(verb): pasta inst: fork the idea is to learn the parser from examplesSlide17: the parser can be represented easily as a logic program learning requires a relational representation, as there are relations between parts of speech that are to be reflected in the parser initially, overly general rules are learned: op([Top, Second|RestStack], Input, [NewTop|RestStack], Input) :- combine(Top, agt, Second, NewTop) ILP is applied to specialize overly general rules op([A, [B|det:the]],C,[D],C) :- animate(B), attach(A, agt, B, D). animate(girl). animate(boy). animate(cat).Slide18: as the new predicate animate is invented, constructive ILP techniques are used. predicate invention extends the language: the examples never included the concept of animate another invented predicate was useful to make parsing distinctions: instr_phrase([with,the,X]) :- instrument(X). instrument(fork). instrument(hammer).… the resulting parser achieved 92% accuracy with just 150 training examples (sentences) CAD: Finite Element mesh design(B. Dolsak, S. Muggleton, Application of ILP to Finite Element Mesh Design, in ILP, S. Muggleton, ed., pp. 453-472): CAD: Finite Element mesh design (B. Dolsak, S. Muggleton, Application of ILP to Finite Element Mesh Design, in ILP, S. Muggleton, ed., pp. 453-472) stress analysis in structures structure = finite collection of elements (meshes) Slide20: the important question is the one of resolution: how many elements there should be on any given edge for the mesh to correctly propagate the stress? considerable number of examples is published general rules can be learned from these examples essentially, a relational problem! training instances: +mesh(b1,6) -mesh(b1,1) +mesh(b2,1) -mesh(b2,2) +mesh(b3,6) -mesh(b3,2) +… -…Slide21: background knowledge: types of edges (important_long, _short, not_important, circuit, half_circuit,…) boundary conditions(free, one_side_fixed,…), loadings (not_loaded,one_side_loaded,…fully_loaded) geometry: neighbor, opposite, equal neighbor(b1,b2) opposite(b1,b3) eq(b1,b3) neighbor(b2,b3) opposite(b10,b33) eq(b3,b6)Slide22: examples of results: mesh(A,B) ¬ not_important(A), B=1. mesh(A,B) ¬ quarter_circuit(A), B=9. mesh(A,B) ¬ opposite(A,C),not_important(C), short(A),B=2 . when using FOIL, problems with neighbor: lack of discriminatory power – clichés!Predicting Structure/Activity Relationships(drug design): Predicting Structure/Activity Relationships (drug design) predicting activity from molecular properties (hydrophobicity, sigma effect, molecular reflectivity, LUMO) limited in the representation of molecular connectivity determinate constraints require that ea. atom is bonded to at most one other atommutagenicity problem: mutagenicity problem there are some statistical results, using regression, as baseline 230 compounds were classified for high/low mutagenicity 188 ‘regression-friendly’ compunds were classified by PROGOL with accuracy matching both regression-bnased and ANN classifier rules are simple and comprehensible, e.g. LUMO value <= -1.570 and carbon atom merging six-membered aromatic ringsSlide25: remaining 42 compounds: PROGOL requires presence of 5-member aromatic carbon ring with a nitrogen atom linked by a single bond followed by a double bond This is a true scientific discovery to appear in NATUREMusical application (M. Dovey, Oxford): Musical application (M. Dovey, Oxford) goal: to learn the style of a pianist (Rachmaniniov) from a recording piano dataset (dynamics) knowledge representation: relational, e.g. we want ot say that the current note is higher than the previous one structural and performance data combined Slide29: Results say something about Rachmaninov’s playing style: e.g. rubato_half(A) <- pre(B, A),, touch_staccato(A), rythm-quaver(A), rubato_detach_half(B) note is played half its length if the note is marked staccato and the gap after the note is lengthened up to four notes are connected learned rules are typically 70-90% correct Conclusion: Conclusion many ILP systems are capable of constructive learning there are good theoretical foundations allowing firm statements on learnability, generality, etc. there is a number of exciting applications more and more robust implementations are developed, enabling scaling-up of problemsProtein secondary structure prediction using GOLEMMuggleton, King, Sternberg, Protein Engineering 1992: Protein secondary structure prediction using GOLEM Muggleton, King, Sternberg, Protein Engineering 1992 Slide32: Protein: 1CRN (Crambin): ttccpsivarsfnvcrlpgtpeaicatgtgciipgatcpgdyan aminoacid sequence ------HHHHHHHHHHH-----HHHHHHHH-------------- secondary structure -----HHHH-----HHH----HHHHHHHHHH------------- GOLEM ------------------------HHH----------------- “GOR” (standard method in biochem.) H = alpha-helix - = coilSlide33: T = 46 Wgolem = 14 WGOR = 3 Xgolem = 23 XGOR = 27 Qgolem = 100*(37/46) = 80.4 XGOR = 100*(30/46)n= 65.2 Slide34: Representation - “skeleton” predicates alpha(A, B) residue in pos. B in protein A is an alpha-helix, eg. alpha(1CRN, 7) octf(D, E, F, G, B, H, I, J, K) defines a “window” of positions around B position(A, B, N) residue at position B in protein A has primary structure N, eg. position(1CRN, 7, i)Representation - attributes (unary predicates): Representation - attributes (unary predicates) hydrpohobic(X) large(X), small(X), tiny(X) aromatic(X), non-aromatic(X) not_p(X) X is not proline not_k(X) X is not lysineLearning: Learning Some secondary structures are given to GOLEM, together with their primary structures classification rules (level 0) are learned, eg. alpha0(A, B) :- octf(D,E,F,G,B,H,I,J,K), pos(A,F,Q), not_p(Q), pos(A,G,N), not_aromatic(N), not_p(N), pos(A,B,C), large(C), not_aromatic(C), not_k(C), pos(A,H,L), …Slide37: Level 1 rules are obtained by adding level 0 rules to the background knowledge, and GOLEM was re-run (level 0 rules predicted only very short sequences). Level 2 rules were obtained by adding level 1 rules to background knowledge. 21 level 0 rules, 5 level 1 rules and 2 level 2 rules were obtained, e.g. alpha1(A, B) :- octf(D,E,F,G,B,H,I,J,K), alpha0(A,H), alpha0(A,I) level 1 alpha2(A,B) :- octf(C,D,E,F,B,G,H,I,J), alpha1(A,G), alpha1(A,H), alpha1(A,I)Slide38: a residue is classified as an alpha-helix if the condition of any alpha0, alpha1, alpha2 rule is true, e.g. the sequence kpkppp and the rule alpha0(A,B) if octf(D,E,F,G,B,H,I,J,K), pos(A,B,N), small(N), pos(A,G,M), not_k(M) and the level 1 and level 2 rules above. We have kkkpppp HHH level 0 H level 1 _______________________ ---HHHHConclusion: Conclusion an emerging new field combining logic programming and machine learning goes beyond representational limitations of the standard attribute-value approaches bottom-up systems (e.g. GOLEM) use inverse resolution and/or RLGG. Inverse resolution has its limits. Inverse implication is more powerful, but also computationally expensive top-down systems (e.g. FOIL) specialize clauses that cover positive examples. They use heuristics to improve this search.