Modelling Relational Statistics With Bayes Nets

School of Computing Science, Simon Fraser University, Vancouver, Canada. Modelling Relational Statistics With Bayes Nets. Class-Level and Instance-Level Queries. Classic AI research distinguished two types of probabilistic relational queries (Halpern 1990, Bacchus 1990).

Slide 1

Modelling Relational Statistics With Bayes Nets

School of Computing Science, Simon Fraser University, Vancouver, Canada

Oliver Schulte

Hassan Khosravi

Arthur Kirkpatrick

Tianxiang Gao

Yuke Zhu

Slide 2

Class-Level and Instance-Level Queries

Classic AI research distinguished two types of probabilistic relational queries (Halpern 1990, Bacchus 1990).

Class-level queries, each with its reference class:

What is the percentage of flying birds? Reference class: birds.
What is the percentage of friendship pairs where both are women? Reference class: pairs of friends.
What is the percentage of A grades awarded to highly intelligent students? Reference class: student-course pairs where the student is registered in the course.

Instance-level queries:

Given that Tweety is a bird, what is the probability that Tweety flies?
Given that Sam and Hilary are friends, and given the genders of their other friends, what is the probability that Sam and Hilary are both women?
What is the probability that Jack is highly intelligent given his grades?

Relational queries divide into instance-level queries (ground facts, type 2 probabilities) and class-level queries (relational statistics, type 1 probabilities).

Halpern, An analysis of first-order logics of probability, AI Journal 1990.
Bacchus, Representing and reasoning with probabilistic knowledge, MIT Press 1990.
M. Chiang and D. Poole (2012), Reference classes and relational learning.

Slide 3

Visualizing Class-Level Probability

Percentage of Flying Birds = 90%. Halpern: Probability that a typical or random bird flies is 90%.

Syntactic distinction: a class-level query contains some free variables, e.g. P(Flies(B)) = ?, while an instance-level query contains no free variables, e.g. P(Flies(tweety)) = ?.

Slide 4

Applications of Class-Level Modelling

1st-order rule learning (e.g., intelligent students take difficult courses).
Strategic planning (e.g., increase SAT requirements to decrease student attrition).
Query optimization (Getoor, Taskar, Koller 2001). Class-level queries support selectivity estimation, which is used to find an optimal evaluation order for an SQL query.
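As a rough illustration of how a class-level probability can feed selectivity estimation, the sketch below multiplies a joint probability by the number of tuples in the reference class to estimate a query's result size. The function name `estimate_result_size`, the probability, and the table sizes are all hypothetical, and this is not the estimator of Getoor, Taskar, and Koller, only the basic idea.

```python
def estimate_result_size(joint_prob, table_sizes):
    """Estimate how many tuples satisfy a query condition.

    joint_prob  -- class-level probability that a randomly selected tuple
                   from the reference class satisfies the condition
    table_sizes -- sizes of the entity tables whose cross product forms
                   the reference class (e.g. students and courses)
    """
    if not 0.0 <= joint_prob <= 1.0:
        raise ValueError("joint_prob must lie in [0, 1]")
    population = 1
    for size in table_sizes:
        population *= size  # size of the cross product
    return joint_prob * population


# Hypothetical numbers: 1000 students, 50 courses, and a 20% chance that a
# random student-course pair satisfies the query condition.
estimated_rows = estimate_result_size(0.20, [1000, 50])
print(estimated_rows)
```

A query optimizer could compare such estimates for alternative sub-queries and evaluate the most selective one first.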

Getoor, Lise, Taskar, Benjamin, and Koller, Daphne. Selectivity estimation using probabilistic models. ACM SIGMOD Record, 30(2):461-472, 2001.

Slide 5

No Grounding Semantics for Class-level Queries

Unrolling yields a network model of individual entities. That model has no classes, so it cannot answer class-level queries.

Class-level template with 1st-order variables: intelligence(S), diff(C), Registered(S,C).

Instance-level model with domain(S) = {jack, jane} and domain(C) = {100, 200}: intelligence(jack), intelligence(jane), diff(100), diff(200), Registered(jack,100), Registered(jack,200), Registered(jane,100), Registered(jane,200).

Slide 6

Previous Work: Probabilistic Queries in Statistical-Relational Learning

Class-level: Statistical-Relational Models (Getoor, Taskar, Koller 2001).
Instance-level: many model types, including Probabilistic Relational Models, Markov Logic Networks, Bayes Logic Programs, and Logical Bayesian Networks.

Slide 7

New Unified Approach

Class-level: Parametrized Bayes Nets + new class-level semantics.
Instance-level: Parametrized Bayes Nets + combining rules (Poole 2003) + log-linear model (Khosravi, Schulte et al. 2010, Schulte and Khosravi 2012).

David Poole, First-Order Probabilistic Inference, IJCAI 2003.
H. Khosravi, O. Schulte, T. Man, X. Xu, and B. Bina, Structure learning for Markov logic networks with many descriptive attributes, in AAAI, 2010.
O. Schulte and H. Khosravi. Learning graphical models for relational data via lattice search. Machine Learning, 2012.

Slide 8

Random Selection Semantics: Example

Apply the random selection semantics for probabilistic 1st-order logic (Halpern 1990; Bacchus 1990).

Bayes net nodes: intelligence(S), diff(C), Registered(S,C).

P(intelligence(S) = hi, diff(C) = hi, Registered(S,C) = true) = 20% means:

If we randomly select a student and a course, then the probability is 20% that the student is registered in the course and that both the intelligence of the student and the difficulty of the course are high.

Slide 9

Computing Parameter Estimates (I)

Use conditional database probabilities as Bayes net parameters. This maximizes the random selection pseudo-likelihood (Schulte 2011).
For database probabilities with all true relationships, use SQL or Virtual Join (Yin, Han et al. 2004).
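A minimal sketch of the random selection semantics and of conditional database frequencies used as parameters follows. The toy database and the helpers `joint_prob` and `cond_prob` are hypothetical. The sketch assumes that Registered(S,C) has parents intelligence(S) and diff(C), and it enumerates all student-course pairs directly instead of using SQL or a virtual join.

```python
from itertools import product

# Toy database (hypothetical data, for illustration only).
intelligence = {"jack": "hi", "jane": "lo"}
diff = {100: "hi", 200: "lo"}
registered = {("jack", 100), ("jane", 100), ("jane", 200)}


def joint_prob(event):
    """Probability that a randomly selected (student, course) pair
    satisfies `event`, i.e. the fraction of all pairs that satisfy it."""
    pairs = list(product(intelligence, diff))
    hits = sum(1 for s, c in pairs if event(s, c))
    return hits / len(pairs)


def cond_prob(event, given):
    """Conditional database frequency P(event | given) over all pairs."""
    denominator = joint_prob(given)
    if denominator == 0:
        raise ValueError("conditioning event has zero frequency")
    return joint_prob(lambda s, c: event(s, c) and given(s, c)) / denominator


def is_registered(s, c):
    return (s, c) in registered


def both_high(s, c):
    return intelligence[s] == "hi" and diff[c] == "hi"


# Class-level joint probability under the random selection semantics.
p_joint = joint_prob(lambda s, c: both_high(s, c) and is_registered(s, c))

# Parameter estimate for Registered(S,C) = true given both parents = hi.
p_param = cond_prob(is_registered, both_high)

print(p_joint, p_param)
```

Enumerating the cross product is only workable for toy data; the source points to SQL or a virtual join for real databases.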

Schulte, O. A tractable pseudo-likelihood function for Bayes nets applied to relational data. SIAM SDM, 2011.
Yin, X., Han, J. et al. CrossMine: Efficient Classification Across Multiple Database Relations. Constraint-Based Mining and Inductive Databases, 2004.

Slide 10

Computing Parameter Estimates (II)

How can database probabilities be computed for negated relations, e.g., the number of U.S. users who are not friends?
Materializing complement tables is unscalable.
For a single false relation, use the 1-minus trick (Getoor et al. 2007).
General case: a new application of the fast Möbius transform (Kennes and Smets 1990).

Getoor, Lise, Friedman, Nir, Koller, Daphne, Pfeffer, Avi, and Taskar, Benjamin. Probabilistic relational models, 2007.
Kennes, Robert and Smets, Philippe. Computational aspects of the Möbius transformation. In UAI, 1990.

Slide 11

The Möbius Parametrization

For two link types R1 and R2 (table values not recoverable from the transcript): the joint probabilities are given by tables with columns R1, R2, Count(*). The Möbius parameters are given by tables with columns R1, R2, Count(*); R1, Count(*); R2, Count(*); and Count(*) with no condition.

Slide 12

Evaluation

Fast: parameters in minutes or less.
Accurate queries/estimates.
Try it yourself in our demo!

Complexity argument: at each step, consider all rows with R_i = *. All other columns have one of two values: T and * for j > i, and T and F for j < i. R is the list of relationships, and R_i is a single relationship. Sigma is an arbitrary condition on attributes.
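The step described in the complexity argument can be sketched as an iterative subtraction, in the spirit of the fast Möbius transform. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the function name `mobius_to_joint`, the tuple encoding of rows, and the example counts are hypothetical. Input rows use T (true) or * (unspecified) for each relationship; at step i every row with R_i = * is replaced by a row with R_i = F whose count is the * count minus the matching T count.

```python
def mobius_to_joint(star_counts, num_relationships):
    """Convert counts over {T, *} rows into counts over {T, F} rows.

    star_counts maps a tuple such as ('T', '*') to the number of instances
    for which the relationships marked 'T' hold and the relationships
    marked '*' are left unspecified.
    """
    table = dict(star_counts)
    for i in range(num_relationships):
        new_table = {}
        for row, count in table.items():
            if row[i] == "*":
                true_row = row[:i] + ("T",) + row[i + 1:]
                false_row = row[:i] + ("F",) + row[i + 1:]
                # count(R_i = F) = count(R_i unspecified) - count(R_i = T)
                new_table[false_row] = count - table[true_row]
            else:
                new_table[row] = count
        table = new_table
    return table


# Hypothetical counts for two link types R1 and R2.
star_counts = {
    ("T", "T"): 2,   # both relationships hold
    ("T", "*"): 5,   # R1 holds, R2 unspecified
    ("*", "T"): 4,   # R2 holds, R1 unspecified
    ("*", "*"): 20,  # no condition
}
joint_counts = mobius_to_joint(star_counts, 2)
print(joint_counts)
```

Each pass touches every row once, so the table keeps the same number of rows throughout and no complement table is ever materialized.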
