10/8. Complexity of Propositional Inference


Page 1: 10/8

Page 2:

Complexity of Propositional Inference

• Any sound and complete inference procedure has to be Co-NP-complete, since model-theoretic entailment computation is Co-NP-complete (propositional satisfiability is NP-complete, and entailment is its complement)

• Given a propositional database of size d
  – Any sentence S that follows from the database by modus ponens can be derived in linear time
  – If the database has only HORN sentences (sentences whose CNF form has at most one positive literal per clause, e.g. A & B => C), then MP is complete for that database.
    • PROLOG uses (first-order) Horn sentences
  – Deriving all sentences that follow by resolution is Co-NP-complete (exponential)

• Anything that follows by unit resolution can be derived in linear time.
  – Unit resolution: at least one of the two clauses being resolved must be a clause of length 1 (used in Davis-Putnam)
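
As a minimal sketch of why unit resolution is cheap, here is a unit-propagation loop in Python; the clause encoding (frozensets of (symbol, polarity) literals) and the name unit_propagate are mine for illustration, not from the slides.

def unit_propagate(clauses):
    """Repeatedly apply unit resolution (the linear-time special case, used in Davis-Putnam)."""
    clauses = set(clauses)
    assigned = {}                                   # symbol -> truth value forced by a unit clause
    changed = True
    while changed:
        changed = False
        for c in list(clauses):
            if len(c) == 1:                         # found a unit clause
                (sym, pol) = next(iter(c))
                assigned[sym] = pol
                changed = True
                new_clauses = set()
                for d in clauses:
                    if (sym, pol) in d:
                        continue                    # d is satisfied by the unit; drop it
                    if (sym, not pol) in d:
                        d = d - {(sym, not pol)}    # the unit-resolution step itself
                        if not d:
                            return None, assigned   # empty clause: contradiction found
                    new_clauses.add(d)
                clauses = new_clauses
                break
    return clauses, assigned

# Example: from A and A => B (in clausal form, ~A v B), unit resolution derives B.
kb = {frozenset({("A", True)}), frozenset({("A", False), ("B", True)})}
print(unit_propagate(kb))   # no clauses remain; both A and B are forced true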

Page 3:

Search in Resolution

• Convert the database into clausal form Dc
• Negate the goal first, and then convert it into clausal form DG
• Let D = Dc + DG
• Loop
  – Select a pair of clauses C1 and C2 from D
    • Different control strategies can be used to select C1 and C2 to reduce the number of resolutions tried
      – Idea 1: Set of support: either C1 or C2 must be the goal clause or a clause derived by doing resolutions on the goal clause (*COMPLETE*)
      – Idea 2: Linear input form: either C1 or C2 must be one of the clauses in the input KB (*INCOMPLETE*)
      – Idea 3: Linear resolution: pick the most recent resolvent as one of the pair of clauses, so the resolution tree looks like a "line" (*INCOMPLETE* in general, but COMPLETE for HORN clauses)
  – Resolve C1 and C2 to get C12
  – If C12 is the empty clause, return Success (we proved the theorem)
  – D = D + C12
• End loop
• If we get here, we couldn't derive the empty clause. Return "Failure"
  – Finiteness is guaranteed if we make sure that:
    • we never resolve the same pair of clauses more than once; AND
    • we use factoring, which removes multiple copies of literals from a clause (e.g. Q V P V P => Q V P)
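
A rough Python sketch of this loop, using the same frozenset clause encoding as the earlier unit-propagation sketch and the set-of-support strategy (Idea 1); the names resolve and refute are mine, and this is an illustration of the procedure above, not the course's own code.

def resolve(c1, c2):
    """All resolvents of two clauses; because clauses are sets, factoring comes for free."""
    resolvents = set()
    for (sym, pol) in c1:
        if (sym, not pol) in c2:
            resolvents.add((c1 - {(sym, pol)}) | (c2 - {(sym, not pol)}))
    return resolvents

def refute(kb_clauses, negated_goal_clauses):
    """Try to derive the empty clause from D = Dc + DG."""
    d = set(kb_clauses) | set(negated_goal_clauses)
    sos = set(negated_goal_clauses)        # set of support: goal clauses and their descendants
    tried = set()                          # never resolve the same pair of clauses twice
    while True:
        new = set()
        for c1 in sos:
            for c2 in d:
                pair = frozenset({c1, c2})
                if pair in tried:
                    continue
                tried.add(pair)
                for r in resolve(c1, c2):
                    if not r:
                        return True        # empty clause derived: theorem proved
                    new.add(r)
        if new <= d:
            return False                   # fixpoint reached without the empty clause
        d |= new
        sos |= new

# Example: KB = {P, P => Q, Q => R}; to prove R we add the negated goal ~R and refute.
kb = [frozenset({("P", True)}),
      frozenset({("P", False), ("Q", True)}),
      frozenset({("Q", False), ("R", True)})]
print(refute(kb, [frozenset({("R", False)})]))   # True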

Page 4:

Mad chase for the empty clause...

• You must have everything in CNF clauses before you can resolve
  – The goal must be negated first before it is converted into CNF form
  – The goal (the fact to be proved) may turn into multiple clauses when converted (e.g. if we want to prove P V Q, its negation ~P & ~Q yields the two clauses ~P and ~Q to add to the database)
• Resolution works by resolving away a single literal and its negation
  – P V Q resolved with ~P V ~Q is not empty!
    • In fact, these clauses are not inconsistent (P true and Q false makes sure that both clauses are satisfied)
  – P V Q is the negation of ~P & ~Q. The latter becomes two separate clauses, ~P and ~Q. So, by doing two separate resolutions with these two clauses we can derive the empty clause
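
Run through the refute sketch from the previous page (my illustrative code, not the slides'), this is exactly the two-step derivation described above: the negated goal contributes the clauses ~P and ~Q, and resolving them against P V Q yields the empty clause.

# KB contains P v Q; to prove P v Q we negate it, getting the two clauses ~P and ~Q.
kb = [frozenset({("P", True), ("Q", True)})]
negated_goal = [frozenset({("P", False)}), frozenset({("Q", False)})]
print(refute(kb, negated_goal))   # True: P v Q with ~P gives Q, then Q with ~Q gives the empty clause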

Page 5:

Solving problems using propositional logic

• Need to write what you know as propositional formulas
• Theorem proving will then tell you whether a given new sentence holds given what you know
• Three kinds of queries
  – Is my knowledge base consistent? (i.e. is there at least one world where everything I know is true?) Satisfiability
  – Is the sentence S entailed by my knowledge base? (i.e. is it true in every world where my knowledge base is true?)
  – Is the sentence S consistent with (possibly true given) my knowledge base? (i.e. is S true in at least one of the worlds where my knowledge base holds?)
    • S is consistent with the KB iff ~S is not entailed by it
• But propositional logic cannot differentiate between degrees of likelihood among the possible sentences
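
A small model-enumeration sketch of these three queries in Python; the encoding of sentences as Boolean functions over a world, and the names models, satisfiable, entails, and consistent_with, are mine for illustration.

from itertools import product

def models(symbols, sentence):
    """All worlds (truth assignments over `symbols`) in which `sentence` holds."""
    worlds = (dict(zip(symbols, vals))
              for vals in product([True, False], repeat=len(symbols)))
    return [w for w in worlds if sentence(w)]

def satisfiable(symbols, kb):           # is there at least one world where the KB is true?
    return len(models(symbols, kb)) > 0

def entails(symbols, kb, s):            # is S true in every world where the KB is true?
    return all(s(w) for w in models(symbols, kb))

def consistent_with(symbols, kb, s):    # S is consistent with the KB iff ~S is not entailed
    return not entails(symbols, kb, lambda w: not s(w))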

Page 6:

Example

• Pearl lives in Los Angeles. It is a high-crime area. Pearl installed a burglar alarm. He asked his neighbors John & Mary to call him if they hear the alarm. This way he can come home if there is a burglary. Los Angeles is also earthquake-prone. The alarm goes off when there is an earthquake.

Burglary => Alarm
Earth-Quake => Alarm
Alarm => John-calls
Alarm => Mary-calls

If there is a burglary, will Mary call?
  Check KB & B |= M
If Mary didn't call, is it possible that a burglary occurred?
  Check that KB & ~M doesn't entail ~B
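
Using the model-enumeration sketch from the previous page (again my own illustrative encoding), the two checks can be run directly. Note that with these purely logical rules, KB & ~M does entail ~B, i.e. the idealized model claims a burglary is impossible if Mary didn't call, which is exactly the brittleness the next slide takes up.

# Propositions: B (Burglary), E (Earth-Quake), A (Alarm), J (John-calls), M (Mary-calls).
syms = ["B", "E", "A", "J", "M"]
implies = lambda p, q: (not p) or q
kb = lambda w: (implies(w["B"], w["A"]) and implies(w["E"], w["A"]) and
                implies(w["A"], w["J"]) and implies(w["A"], w["M"]))

# If there is a burglary, will Mary call?  Check KB & B |= M.
print(entails(syms, lambda w: kb(w) and w["B"], lambda w: w["M"]))          # True

# If Mary didn't call, is a burglary still possible?  Check whether KB & ~M entails ~B.
print(entails(syms, lambda w: kb(w) and not w["M"], lambda w: not w["B"]))  # also True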

Page 7:

Example (Real)

• Pearl lives in Los Angeles. It is a high-crime area. Pearl installed a burglar alarm. He asked his neighbors John & Mary to call him if they hear the alarm. This way he can come home if there is a burglary. Los Angeles is also earthquake-prone. The alarm goes off when there is an earthquake.

• Pearl lives in the real world, where (1) burglars can sometimes disable alarms, (2) some earthquakes may be too slight to cause the alarm, (3) even in Los Angeles, burglaries are more likely than earthquakes, (4) John and Mary both have their own lives and may not always call when the alarm goes off, (5) between John and Mary, John is more of a slacker than Mary, and (6) John and Mary may call even without the alarm going off.

Burglary => Alarm
Earth-Quake => Alarm
Alarm => John-calls
Alarm => Mary-calls

If there is a burglary, will Mary call?
  Check KB & B |= M
If Mary didn't call, is it possible that a burglary occurred?
  Check that KB & ~M doesn't entail ~B
John already called. If Mary also calls, is it more likely that a burglary occurred?
You now also hear on the TV that there was an earthquake. Is a burglary more or less likely now?

Page 8:

How do we handle Real Pearl?

• Eager way:
  – Model everything!
  – E.g. model exactly the conditions under which John will call:
    • He shouldn't be listening to loud music, he hasn't gone on an errand, he didn't recently have a tiff with Pearl, etc.
    A & c1 & c2 & c3 & ... & cn => J
    (also, the exceptions may have interactions, e.g. c1 & c5 => ~c9)
  – Qualification and ramification problems make this an infeasible enterprise.

• Ignorant (non-omniscient) and lazy (non-omnipotent) way:
  – Model the likelihood: in 85% of the worlds where there was an alarm, John will actually call
  – How do we do this?
    • Non-monotonic logics
    • "Certainty factors"
    • "Probability" theory?

Page 9:

Probabilistic Calculus to the Rescue

Suppose we know the likelihood of each of the (propositional) worlds (aka the joint probability distribution).

Then we can use the standard rules of probability to compute the likelihood of all queries (as I will remind you).

So, the joint probability distribution is all that you ever need!

In the case of the Pearl example, we just need the joint probability distribution over B, E, A, J, M (32 numbers); in general, 2^n separate numbers (which should add up to 1).

If the joint distribution is sufficient for reasoning, what is domain knowledge supposed to help us with?
  – Answer: indirectly, by helping us specify the joint probability distribution with fewer than 2^n numbers.
  – The local relations between propositions can be seen as "constraining" the form the joint probability distribution can take!

Burglary => Alarm
Earth-Quake => Alarm
Alarm => John-calls
Alarm => Mary-calls

Only 10 (instead of 32) numbers to specify!
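
One way to see where the 10 comes from (this is the standard Bayes-network reading of those four rules, spelled out here rather than taken from the slide): the dependency structure lets the joint factor as

P(B, E, A, J, M) = P(B) * P(E) * P(A | B, E) * P(J | A) * P(M | A)

which needs 1 + 1 + 4 + 2 + 2 = 10 numbers instead of 2^5 = 32.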

Page 10: 10/10

Page 11:

Blog discussion

• Number of clauses with n variables is 3^n (a variable can be present positively in a clause, present negatively, or not present at all, so 3 possibilities per variable)
  – Number of KBs with n variables: 2^(3^n)
  – But the number of sets of worlds is only 2^(2^n)
  – ...more KBs than there are sets of worlds!
  – ...because multiple syntactically different KBs can refer to the same set of worlds (e.g. the KB {(P,Q), (P,~Q)} is equivalent to the KB {(P,R), (P,~R)})

• Non-monotonicity of probabilistic reasoning
• Parameterized distributions
• Probability vs. statistics
  – Inference/reasoning (with the model) vs. learning/acquiring (getting the model)
  – Computing the posterior distribution given the current model and evidence is not learning...
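
A brute-force check of the counting argument for n = 2 (my own encoding of clauses and KBs, purely for illustration): 2^(3^2) = 512 syntactically distinct KBs collapse onto only 2^(2^2) = 16 distinct sets of worlds.

from itertools import combinations, product

symbols = ["P", "Q"]                                  # n = 2

# Each symbol appears in a clause positively, negatively, or not at all: 3^n clauses.
clauses = [frozenset((s, pol) for s, pol in zip(symbols, choice) if pol is not None)
           for choice in product([True, False, None], repeat=len(symbols))]

worlds = [dict(zip(symbols, vals)) for vals in product([True, False], repeat=len(symbols))]

def model_set(kb):
    """Indices of the worlds in which every clause of the KB is true."""
    return frozenset(i for i, w in enumerate(worlds)
                     if all(any(w[s] == pol for s, pol in c) for c in kb))

all_kbs = [kb for r in range(len(clauses) + 1) for kb in combinations(clauses, r)]
print(len(all_kbs), len({model_set(kb) for kb in all_kbs}))   # 512 16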

Page 12:

Easy Special Cases

• If there are no relations between the propositions (i.e., they can take values independently of each other)
  – Then the joint probability distribution can be specified in terms of the probability of each proposition being true
  – Just n numbers instead of 2^n

• If, in addition, each proposition is equally likely to be true or false
  – Then the joint probability distribution can be specified without giving any numbers!
    • All worlds are equally probable! If there are n propositions, each world will be 1/2^n probable
  – The probability of any propositional conjunction with m (< n) propositions will be 1/2^m
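
A tiny sketch of the independent case, with made-up per-proposition probabilities (the numbers and names are illustrative, not from the slides): n numbers determine the whole joint, and under the additional equally-likely assumption every world gets probability 1/2^n.

# One number per proposition: the probability that it is true.
p = {"A": 0.3, "B": 0.7, "C": 0.5}

def joint(world):
    """Probability of a complete world under independence: a product of per-proposition factors."""
    prob = 1.0
    for sym, is_true in world.items():
        prob *= p[sym] if is_true else 1 - p[sym]
    return prob

print(joint({"A": True, "B": False, "C": True}))   # 0.3 * 0.3 * 0.5 = 0.045
# If, in addition, every p were 0.5, every world would get probability 1 / 2**3 = 0.125.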

Page 13:

Will we always need 2^n numbers?

• If the variables are all independent of each other, then
  – P(x1, x2, ..., xn) = P(xn) * P(xn-1) * ... * P(x1)
  – We need just n numbers!
  – But if our world were that simple, it would also be very uninteresting (nothing is correlated with anything else!)

• We need 2^n numbers if every subset of our n variables can be correlated
  – P(x1, x2, ..., xn) = P(xn | x1, ..., xn-1) * P(xn-1 | x1, ..., xn-2) * ... * P(x1)
  – But that is too pessimistic an assumption about the world
    • If our world were so interconnected, we would have been dead long ago...

A more realistic middle ground is that interactions between variables are confined to regions.
  – e.g. the "school variables" and the "home variables" interact only loosely (they are independent for most practical purposes)
  – We will wind up needing O(2^k) numbers (k << n)
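
One way to see where the 2^n comes from: in the chain-rule factorization above, the term P(xi | x1, ..., xi-1) needs one number for each of the 2^(i-1) settings of its conditioning variables, and 2^0 + 2^1 + ... + 2^(n-1) = 2^n - 1, i.e. essentially 2^n numbers in the fully correlated case.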

Page 14: Probabilistic Calculus to the Rescue (repeat of Page 9)

Page 15:

Prob. Prop. Logic: The Game Plan

• We will review elementary "discrete variable" probability
• We will recall that the joint probability distribution is all we need to answer any probabilistic query over a set of discrete variables.
• We will recognize that the hardest part here is not the cost of inference, which is really only O(2^n), no worse than for (deterministic) propositional logic
  – Actually it is Co-#P-complete (instead of Co-NP-complete), and the former is believed to be harder than the latter
• The real problem is assessing probabilities.
  – You could need as many as 2^n numbers (if all variables are dependent on all other variables), or just n numbers if each variable is independent of all the others. Generally, you are likely to need somewhere between these two extremes.
  – The challenge is to
    • recognize the "conditional independences" between the variables, and exploit them to get by with as few input probabilities as possible, and
    • use the assessed probabilities to compute the probabilities of the user queries efficiently.

Page 16:

Propositional Probabilistic Logic

Page 17:

Page 18:

CONDITIONAL PROBABILITIES

Non-monotonicity w.r.t. evidence
  – P(A|B) can be higher than, lower than, or equal to P(A)
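
As a concrete illustration (using the joint table that appears on Pages 23-24): P(CA) = 0.10, yet P(CA | TA) = 0.04/0.05 = 0.80 while P(CA | ~TA) = 0.06/0.95 ≈ 0.063, so conditioning on evidence can push a probability up or down.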

Page 19:

Most useful probabilistic reasoning involves computing posterior distributions

[Figure: probability plotted against variable values, with curves for P(A), P(A|B=T), and P(A|B=T, C=False)]

Important: Computing the posterior distribution is inference, not learning.

Page 20:

If B=>A then P(A|B) = ? P(B|~A) = ? P(B|A) = ?
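
(The implication pins down the first two but not the third: if B => A and P(B) > 0, then P(A|B) = 1, and P(B|~A) = 0 whenever P(~A) > 0, while P(B|A) = P(B)/P(A) depends on the rest of the joint distribution.)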

Page 21:

If you know the full joint, you can answer ANY query.

Page 22:

& Marginalization

Page 23:

         TA      ~TA
CA       0.04    0.06
~CA      0.01    0.89

P(CA & TA) =
P(CA) =
P(TA) =
P(CA V TA) =
P(CA|~TA) =

Page 24:

         TA      ~TA
CA       0.04    0.06
~CA      0.01    0.89

P(CA & TA) = 0.04
P(CA) = 0.04 + 0.06 = 0.1 (marginalizing over TA)
P(TA) = 0.04 + 0.01 = 0.05
P(CA V TA) = P(CA) + P(TA) - P(CA & TA) = 0.1 + 0.05 - 0.04 = 0.11
P(CA|~TA) = P(CA & ~TA) / P(~TA) = 0.06 / (0.06 + 0.89) = 0.06/0.95 ≈ 0.063

Think of this as analogous to entailment by truth-table enumeration!
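
A minimal Python sketch of this "enumeration over the joint" (the dictionary layout and the helper name prob are mine, for illustration):

# Joint distribution over (CA, TA), copied from the table above.
joint = {(True, True): 0.04, (True, False): 0.06,
         (False, True): 0.01, (False, False): 0.89}

def prob(event):
    """Sum the joint over all (ca, ta) entries satisfying `event` (marginalization)."""
    return sum(p for (ca, ta), p in joint.items() if event(ca, ta))

print(prob(lambda ca, ta: ca and ta))       # P(CA & TA) = 0.04
print(prob(lambda ca, ta: ca))              # P(CA)      = 0.04 + 0.06 = 0.10
print(prob(lambda ca, ta: ta))              # P(TA)      = 0.04 + 0.01 = 0.05
print(prob(lambda ca, ta: ca or ta))        # P(CA v TA) = 0.11
print(prob(lambda ca, ta: ca and not ta)
      / prob(lambda ca, ta: not ta))        # P(CA|~TA)  = 0.06/0.95, about 0.063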

Page 25:

Page 26:

Problem:
  – Need too many numbers...
  – The needed numbers are harder to assess

You can avoid assessing P(E=e) if you assess P(Y|E=e), since it must add up to 1.

Page 27:

Digression: Is finding the numbers really the hard assessment problem?

• We are making it sound as if assessing the probabilities is a big deal
• In doing so, we are taking into account model acquisition/learning costs.
• How come we didn't care about these issues in logical reasoning? Is it because acquiring logical knowledge is easy?
• Actually, if we are writing programs for worlds that we (the humans) already live in, it is easy for us (humans) to add the logical knowledge to the program. It is a pain to give the probabilities...
• On the other hand, if the agent is fully autonomous and is bootstrapping itself, then learning logical knowledge is actually harder than learning probabilities.
  – For example, we will see that given the Bayes network topology (the "logic"), learning its CPTs is much easier than learning both the topology and the CPTs.