PRMs 02 11

Uploaded from authorPOINTLite
Views:
 
Category: Entertainment
     
 

Presentation Description

No description available.

Comments

Presentation Transcript

SRL Approaches: Frame-based Probabilistic models: 

SRL Approaches: Frame-based Probabilistic models February 11, 2005

Today’s Outline: 

Today’s Outline Finish w/ Graphical Models Introduction Families of SRL Approaches Frame-based Probabilistic approaches Probabilistic Relational Models (PRMs) Probabilistic Entity Relation (PERs)

SRL History: 

SRL History In general, SRL combines logic and probabilities Historically, there are two general threads of research The first takes graphical models or hierarchical Bayesian models and adds in some form of relational/logical representation examples: Probabilistic Relational Models (PRMs), Probabilistic Entity Relation Models (PERs), Object Oriented Bayesian Networks (OOBNs) comes largely from the Uncertainty in AI (UAI} community The second takes a logical representation (first-order logic, horn clauses, etc) and adds in some form of probabilities examples: Bayesian Logic Programs (BLPs), Stochastic Logic Programs (SLPs) comes largely from the Inductive Logic Programming (ILP) community

Families of SRL Approaches: 

Families of SRL Approaches Frame-based Probabilistic Models Probabilistic Relational Models (PRMs), Probabilistic Entity Relation Models (PERs), Object Oriented Bayesian Networks (OOBNs) First Order Probabilistic Logic (FOPL) BLOGs Relational Markov Logic (RML) Stochastic Functional Programs PRISM Stochastic Logic Programs (SLPs) IBAL

SRL Dimensions: 

SRL Dimensions Syntax – ‘logic-based’ vs. ‘schema-based’ Logical Semantics – relational vs. first-order domain closure/closed world vs. open world Probabilistic Semantics – ‘possible worlds’ vs. ‘domain frequencies’ directed vs. undirected models … others?

Today: PRMs: 

Today: PRMs Developed by Daphne Koller’s group at Stanford representation: Avi Pfeffer builds on work in KBMC (knowledge-based model construction) by Haddawy, Poole, Wellman and others… Object Oriented Bayesian Networks Relational Probability Models learning: myself, Nir Friedman, Avi Attribute Uncertainty Structural Uncertainty Class Uncertainty Identity Uncertainty undirected models: Ben Taskar

Motivation: Discovering Patterns in Structured Data: 

Motivation: Discovering Patterns in Structured Data

Learning Statistical Models: 

Learning Statistical Models Traditional approaches work well with flat representations fixed length attribute-value vectors assume independent (IID) sample Problems: introduces statistical skew loses relational structure incapable of detecting link-based patterns must fix attributes in advance

Roadmap: 

Roadmap Background: Bayesian Networks (BNs) [Pearl, 1988] Probabilistic Relational Models (PRMs) Learning PRMs w/ Attribute Uncertainty PRMs w/ Structural Uncertainty PRMs w/ Class Hierarchies

Bayesian Networks: 

Bayesian Networks nodes = random variables edges = direct probabilistic influence Network structure encodes independence assumptions: XRay conditionally independent of Pneumonia given Infiltrates

Bayesian Networks: 

Bayesian Networks Associated with each node Xi there is a conditional probability distribution P(Xi|Pai:) — distribution over Xi for each assignment to parents If variables are discrete, P is usually multinomial P can be linear Gaussian, mixture of Gaussians, … 0.8 0.2 p t p 0.6 0.4 0.01 0.99 0.2 0.8 t p t t p T P P(I |P, T )

BN Semantics: 

BN Semantics Compact & natural representation: nodes have  k parents  2k n vs. 2n params conditional independencies in BN structure + local probability models full joint distribution over domain =

Queries: 

Queries Full joint distribution specifies answer to any query: P(variable | evidence about others) XRay Lung Infiltrates Sputum Smear Tuberculosis Pneumonia XRay Sputum Smear

BN Learning: 

BN Learning BN models can be learned from empirical data parameter estimation via numerical optimization structure learning via combinatorial search. BN hypothesis space biased towards distributions with independence structure.

Roadmap: 

Roadmap Background: Bayesian Networks (BNs) Probabilistic Relational Models (PRMs) Learning PRMs w/ Attribute Uncertainty PRMs w/ Structural Uncertainty PRMs w/ Class Hierarchies

Probabilistic Relational Models : 

Probabilistic Relational Models Combine advantages of relational logic & Bayesian networks: natural domain modeling: objects, properties, relations; generalization over a variety of situations; compact, natural probability models. Integrate uncertainty with relational model: properties of domain entities can depend on properties of related entities; uncertainty over relational structure of domain.

Relational Schema: 

Relational Schema Strain Unique Infectivity Infected with Interacted with Describes the types of objects and relations in the database Contact Close-Contact Skin-Test Age Patient Homeless HIV-Result Ethnicity Disease-Site Contact-Type

Probabilistic Relational Model: 

Probabilistic Relational Model Close-Contact Transmitted Contact-Type Disease Site Strain Unique Patient Homeless HIV-Result POB Contact Age

Relational Skeleton : 

Relational Skeleton Fixed relational skeleton  set of objects in each class relations between them Uncertainty over assignment of values to attributes PRM defines distribution over instantiations of attributes

A Portion of the BN: 

A Portion of the BN P1.Disease Site P1.Homeless P1.POB C1.Close-Contact C1.Transmitted C1.Contact-Type C1.Age C2.Close-Contact C2.Transmitted C2.Contact-Type C2.Age

PRM: Aggregate Dependencies: 

PRM: Aggregate Dependencies sum, min, max, avg, mode, count Disease Site Patient Homeless POB Age Close-Contact Contact-Type Contact

PRM with AU Semantics: 

PRM with AU Semantics PRM relational skeleton  + = Strain s1 Patient p1 Patient p2 Contact c3 Contact c2 Contact c1 Strain s2 Patient p3

Next Time 2/18: 

Next Time 2/18 Structural Uncertainty Learning Probabilistic Models of Link Structure, L. Getoor, N. Friedman, D. Koller, B. Taskar. Journal of Machine Learning Research, 2002. http://www.cs.umd.edu/class/spring2005/cmsc828g/Readings/jmlr02.pdf Class Uncertainty PRMs with Class Hierarchies, chapter 5 of Learning Statistical Models of Relational Data, Lise Getoor, PhD Thesis, Stanford University, 2001. http://www.cs.umd.edu/class/spring2005/cmsc828g/Readings/thesis-ch5.pdf PERS Probabilistic Models for Relational Data, David Heckerman, Christopher Meek and Daphne Koller ftp://ftp.research.microsoft.com/pub/tr/TR-2004-30.pdf Background on Learning Graphical Models available in CS Library

Today’s Outline 2/18: 

Today’s Outline 2/18 Frame-based Probabilistic approaches Probabilistic Relational Models (PRMs) Learning PRMs PRMs w/ Structural Uncertainty PRMs w/ Class Hierarchies Probabilistic Entity Relation (PERs)

Learning PRMs w/ AU: 

Learning PRMs w/ AU Database Patient Strain Contact Relational Schema Patient Contact Strain Parameter estimation Structure selection

Parameter Estimation in PRMs: 

Parameter Estimation in PRMs Assume known dependency structure S Goal: estimate PRM parameters q entries in local probability models, q is good if it is likely to generate the observed data, instance I . MLE Principle: Choose q* so as to maximize l As in Bayesian network learning, crucial property: decomposition separate terms for different X.A

ML Parameter Estimation: 

ML Parameter Estimation Contact CloseContact Transmitted Patient HIV DiseaseSite

Structure Selection: 

Structure Selection Idea: define scoring function do local search over legal structures Key Components: legal models scoring models searching model space

Structure Selection: 

Structure Selection Idea: define scoring function do local search over legal structures Key Components: legal models scoring models searching model space

Legal Models: 

Legal Models author-of PRM defines a coherent probability model over a skeleton  if the dependencies between object attributes is acyclic How do we guarantee that a PRM is acyclic for every skeleton? Researcher Prof. Gump Reputation high Paper P1 Accepted yes Paper P2 Accepted yes sum

Attribute Stratification: 

Attribute Stratification PRM dependency structure S dependency graph Paper.Accecpted Researcher.Reputation if Researcher.Reputation depends directly on Paper.Accepted Algorithm more flexible; allows certain cycles along guaranteed acyclic relations

Structure Selection: 

Structure Selection Idea: define scoring function do local search over legal structures Key Components: legal models scoring models searching model space

Scoring Models: 

Scoring Models Bayesian approach: Standard approach to scoring models; used in Bayesian network learning

Structure Selection: 

Structure Selection Idea: define scoring function do local search over legal structures Key Components: legal models scoring models searching model space

Searching Model Space: 

Searching Model Space Contact Strain Patient Phase 0: consider only dependencies within a class

Phased Structure Search: 

Contact Strain Patient  score Add P.HC.T Contact Strain Patient Contact Patient Strain Phase 1: consider dependencies from “neighboring” classes, via schema relations Phased Structure Search

Phased Structure Search: 

Phased Structure Search  score Add C.PS.I Phase 2: consider dependencies from “further” classes, via relation chains Contact Strain Patient Contact Strain Patient Contact Strain Patient

Experimental Evaluation: 

Experimental Evaluation

Synthetic Data: 

Synthetic Data Simple ‘genetic’ domain Construct training set of various sizes Compare the log-likelihood of test set of size 100,000 ‘gold’ standard model Learn parameters (model structure given) Learn model (learn both structure and parameters)

Slide40: 

Blood Type M-chromosome P-chromosome Person Result Contaminated Blood Test Blood Type M-chromosome P-chromosome Person Blood Type M-chromosome P-chromosome Person (Father) (Mother)

Error on Test Set: 

Error on Test Set

Error Variance: 

Error Variance

Errors in Learned Structure: 

Errors in Learned Structure

TB Cases in SF: 

TB Cases in SF Patient (2300) Ethnicity Homeless Age @ diagnosis HIV result Disease-site X-ray Contact (20000) Contact-type Age Care Infected Strain (1000) Unique Drug-Resistance

TB PRM: 

hivres # contacts result transmitted infectivity smrpos care closecont ageatdx closecont hh_oohh ethnic # infected % infected hh_oohh contype homeless gender contype disease site contage xray pob Contact Strain Subcase Patient TB PRM

SEC PRM: 

total assets # roles rtn earn assets age  rtn assets fired # employees top_role top_role  total_assets retired retired salary salary Company Role Prev-Role Person SEC PRM 20,000 120,000 40,000

Your turn…: 

Your turn… Describe your focus problem What would a PRM for (an aspect of) your focus problem look like?

Roadmap: 

Roadmap Motivation and Background PRMs w/ Attribute Uncertainty PRMs w/ Structural Uncertainty PRMs w/ Class Hierarchies

An Example: 

An Example Topic Theory AI Agent Scientific Paper Attributes of object Attributes of linked objects Attributes of heterogeneous linked objects Collective Classification

Structural Uncertainty: 

Structural Uncertainty Motivation: relational structure provides useful information for density estimation and prediction Construct probabilistic models of relational structure that capture structural uncertainty Two new mechanisms: Reference uncertainty Existence uncertainty

PRMs w/ AU: another example: 

PRMs w/ AU: another example Vote Movie Person PRM consists of: Relational Schema

PRM w/ Attribute Uncertainty: 

Fixed relational skeleton : set of objects in each class relations between them Movie m1 Vote v1 Movie: m1 Person: p1 Person p2 Person p1 Movie m2 Uncertainty over assignment of values to attributes PRM w/ Attribute Uncertainty Vote v2 Movie: m1 Person: p2 Vote v3 Movie: m2 Person: p2

PRM w/ AU Semantics: 

PRM w/ AU Semantics PRM relational skeleton  + = Patient p2 Vote Movie Person Movie Vote Vote Person Person Movie Vote

Issue: 

Issue PRM w/ AU applicable only in domains where we have full knowledge of the relational structure Next we introduce PRMs which allow uncertainty over relational structure…

PRMs w/ Structural Uncertainty : 

PRMs w/ Structural Uncertainty Advantages: Applicable in cases where we do not have full knowledge of relational structure Incorporating uncertainty over relational structure into probabilistic model can improve predictive accuracy Two approaches: Reference uncertainty Existence uncertainty Different probabilistic models; varying amount of background knowledge required for each

Citation Relational Schema: 

Citation Relational Schema Wrote Paper Topic Word1 WordN … Word2 Paper Topic Word1 WordN … Word2 Cites Count Citing Paper Cited Paper Author Institution Research Area

Attribute Uncertainty: 

Attribute Uncertainty Paper Word1 Topic WordN Wrote Author ... Research Area Institution

Reference Uncertainty: 

Reference Uncertainty Bibliography Scientific Paper ` 1. ----- 2. ----- 3. ----- Document Collection

PRM w/ Reference Uncertainty: 

PRM w/ Reference Uncertainty Cites Citing Cited Dependency model for foreign keys Paper Topic Words Paper Topic Words Naïve Approach: multinomial over primary key noncompact limits ability to generalize

Reference Uncertainty Example: 

Reference Uncertainty Example Paper P5 Topic AI Paper P4 Topic AI Paper P3 Topic AI Paper M2 Topic AI Paper P1 Topic Theory Cites Citing Cited Paper P5 Topic AI Paper P3 Topic AI Paper P4 Topic Theory Paper P2 Topic Theory Paper P1 Topic Theory Paper.Topic = AI Paper.Topic = Theory P1 P2

PRMs w/ RU Semantics: 

PRMs w/ RU Semantics Cites Cited Citing Paper Topic Words Paper Topic Words PRM RU

Learning PRMs w/ RU: 

Learning PRMs w/ RU Idea: just like in PRMs w/ AU define scoring function do greedy local structure search Issues: expanded search space construct partitions new operators

Learning : 

Learning Idea: define scoring function do phased local search over legal structures Key Components: legal models scoring models searching model space PRMs w/ RU model new dependencies new operators unchanged

Structure Search: New Operators: 

Structure Search: New Operators Cites Citing Cited Cited Author Institution

PRMs w/ RU Summary: 

PRMs w/ RU Summary Define semantics for uncertainty over foreign-key values Search now includes operators Refine and Abstract for constructing foreign-key dependency model Provides one simple mechanism for link uncertainty

Existence Uncertainty: 

Existence Uncertainty Document Collection Document Collection

PRM w/ Exists Uncertainty: 

PRM w/ Exists Uncertainty Cites Dependency model for existence of relationship Paper Topic Words Paper Topic Words Exists

Exists Uncertainty Example: 

Exists Uncertainty Example Cites Paper Topic Words Paper Topic Words Exists Citer.Topic Cited.Topic False True

PRMs w/ EU Semantics: 

PRMs w/ EU Semantics PRM EU Cites Exists Paper Topic Words Paper Topic Words

Learning PRMs w/ EU: 

Learning PRMs w/ EU Idea: just like in PRMs w/ AU define scoring function do greedy local structure search Issues: efficiency Computation of sufficient statistics for exists attribute Do not explicitly consider relations that do not exist

Structure Selection: 

Structure Selection Idea: define scoring function do phased local search over legal structures Key Components: legal models scoring models searching model space PRMs w/ EU model new dependencies unchanged unchanged

Slide72: 

Experiment I: EachMovie+ age personal_income household_income Person Movie Actor MOVIE ROLE VOTE PERSON ACTOR * © 1999 -2000 Internet Movie Database Limited † http://www.research.digital.com/SRC/EachMovie Size: 1600 Size: 35,000 Size: 50,000 Size: 25,000 Size: 300,000 * †

Slide73: 

EachMovie+ PRM-RU thriller horror gender theater_status gender video_status age animation art_foreign classic personal_income comedy drama rank household_income family Movie Person Movie Actor MOVIE ROLE VOTE PERSON ACTOR education

Slide74: 

EachMovie+ PRM-EU comedy drama rank gender family personal_income horror romance exists household_income thriller exists gender theater_status action education animation art_foreign classic MOVIE ROLE VOTE PERSON ACTOR

Experiment II: Prediction: 

Experiment II: Prediction Paper P506 Topic ?? w1 wN . . .

Domains: 

Domains Cites Exists Paper Topic w1 wN . . . Paper Topic w1 wN . . . cited paper citing paper Cora Dataset, McCallum, et. al Link Exists Web Page Category w1 wN . . . Category w1 wN . . . From Page To Page Web Page WebKB, Craven, et. al

Prediction Accuracy : 

Prediction Accuracy

Prediction Accuracy : 

Prediction Accuracy

Experiment III: Collective Classification: 

Experiment III: Collective Classification Paper#2 Topic Paper#3 Topic WordN Paper#1 Word1 Topic ... ... ... Author#1 Area Inst #1-#2 Author#2 Area Inst Exists #2-#3 Exists #2-#1 Exists #3-#1 Exists #1-#3 Exists WordN Word1 WordN Word1 Exists Topic Topic Area Area #3-#2

Inference in Unrolled BN: 

Inference in Unrolled BN Prediction requires inference in “unrolled” network Infeasible for large networks Use approximate inference for E-step Loopy belief propagation (Pearl, 88; McEliece, 98) Scales linearly with size of network Guaranteed to converge only for polytrees Empirically, often converges in general nets (Murphy,99) Local message passing Belief messages transferred between related instances Induces a natural “influence” propagation behavior Instances give information about related instances

Web Domain: 

... From-Page Category Word1 WordN Exists From To Link Hub To-Page Word Anchor Has ... Category Word1 WordN Hub Web Domain

WebKB Results*: 

WebKB Results* * from “Probabilistic Models of Text and Link Structure for Hypertext Classification”, Getoor, Segal, Taskar and Koller in IJCAI 01 Workshop Text Learning: Beyond Classification

Roadmap: 

Roadmap Motivation and Background PRMs w/ Attribute Uncertainty PRMs w/ Structural Uncertainty PRMs w/ Class Hierarchies

From Instances to Classes in Probabilistic Relational Models: 

From Instances to Classes in Probabilistic Relational Models Compare two approaches Probabilistic Relational Models (PRMs) Bayesian Network (BNs) PRMs with Class Hierarchies (PRM-CH) bridge gap between BNs and PRMs Learning PRM-CHs hierarchy supplied discovering hierarchy

PRM for Collaborative Filtering: 

PRM for Collaborative Filtering Vote Program Voter Ranking + Dependency Model Relational Schema Person Age Gender Education

PRM Instantiation: 

PRM Instantiation TV-Program Nova Genre doc Budget low Timeslot primetime Network PBS Person Jane Doe Age elderly Gender female Education bs Income medium Vote #5630 Ranking ? TV-Program Seinfeld Genre sitcom Budget high Timeslot rerun Network ABC TV-Program Frasier Genre sitcom Budget medium Timeslot primetime Network ABC Person John Deer Age middle-aged Gender male Education hs Income low Vote #5631 Ranking ? Vote #5632 Ranking ? Vote #5632 Ranking ? Vote #5633 Ranking ? Income Income 1 . 0 6 . 0 3 . 0 5 . 0 4 . 0 1 . 0 4 . 0 5 . 0 1 . 0 1 . 0 4 . 0 5 . 0 bs hs sitcom bs doc hs doc h m l E G sitcom

BN for Collaborative filtering: 

BN for Collaborative filtering Law & Order Frasier NBC Monday Night Movies Mad about you Beverly Hills 90210 Seinfeld Friends Melrose Place Models Inc. Breese, et al. UAI-98

Limitations of PRMs: 

Limitations of PRMs In PRM, all instances of the same class must use the same dependency mode, it cannot distinguish: documentaries and sitcoms “60 Minutes” and Seinfeld PRM cannot have dependencies that are “cyclic” ranking for Frasier depends on ranking for Friends

Limitations of BNs: 

Limitations of BNs In BN, each instance has its own dependency model, cannot generalize over instances If John tends to like sitcoms, he will probably like next season’s offerings whether a person enjoys sitcom reruns depends on whether they watch primetime sitcoms BN can only model relationships between at most one class of instances at a time In previous model, cannot model relationships between people if my roommate watches Seinfeld I am more likely to join in

Desired Model : 

Desired Model Allows both class and instance dependencies

PRMs w/ Class Hierarchies : 

PRMs w/ Class Hierarchies Allows us to: Refine a “heterogenous” class into more coherent subclasses Refine probabilistic model along class hierarchy Can specialize/inherit CPDs Construct new dependencies that were originally “acyclic” Provides bridge from class-based model to instance-based model

PRM-CH: 

PRM-CH Person Age Gender Education Income Vote Program Voter Ranking Relational Schema Class Hierarchy SoapOpera

Learning PRM-CHs: 

Learning PRM-CHs Relational Schema Database: TVProgram Person Vote Person Vote Instance I Class hierarchy provided

Structure Selection: 

Structure Selection Idea: define scoring function do phased local search over legal structures Key Components: scoring models searching model space PRM w/ CHs new operators unchanged

Guaranteeing Acyclicity with Subclasses: 

Guaranteeing Acyclicity with Subclasses

Learning PRM-CH : 

Scenario 1: Class hierarchy is provided New Operators Specialize/Inherit Learning PRM-CH BudgetDocumentary

Learning Class Hierarchy : 

Learning Class Hierarchy Issue: partially observable data set Construct decision tree for class defined over attributes observed in training set class2 New operator Split on class attribute Related class attribute

Slide98: 

MOVIE Drama Comedy Romance Action Horror Thriller Theater Status Video Status Art/Foreign Classic VOTE Rating EachMovie+ PRM 1400 Movies 5000 People 240,000 Votes http://www.research.digital.com/SRC/EachMovie

Slide99: 

PRM-CH

Comparison : 

Comparison 5 Test Sets: 1000 votes, ~100 movies, ~115 people PRM Mean LL: -12,079, std 475.68 PRM-CH Mean LL: -10558, std 433.10 Using standard t-test, PRM-CH model outperforms PRM model with over 99% confidence

PRM-CH Summary: 

PRM-CH Summary PRMs with class hierarchies are a natural extension of PRMs: Specialization/Inheritance of CPDs Allows new dependency structures Provide bridge from class-based to instance-based models Learning techniques proposed Need efficient heuristics Empirical validation on real-world domains

Roadmap: 

Roadmap Motivation and Background PRMs w/ Attribute Uncertainty PRMs w/ Structural Uncertainty PRMs w/ Class Hierarchies

Next Time 2/25: 

Next Time 2/25 Focus Problems Please add your focus problem to the class wiki Give a PRM for the problem Give a PER for the problem Give at least one of the logical-based methods (BLP, LPRM, LBN) For each representation, discuss some modeling issue, or some novelty you used – e.g. structural uncertainty, constraints, etc. Readings for next three weeks 2/28 – Logic-based approaches 3/4 – Advanced Logic-based approaches 3/11 – Undirected Models Please sign up to lead the discussion for one of the papers 2/28 – 3/11 For each paper, please post your comments for each paper on the wiki by midnight Wed before the class in which they are assigned to be discussed. This gives the discussion leader some time to synthesize the comments.