
© Enn Tyugu 1

Algorithms of Artificial Intelligence

Lecture 2: Knowledge

E. Tyugu

Spring 2003


Prolog

Prolog is a logic-based programming language, i.e. a language for logic programming. Its statements are Horn clauses.

Examples:

A program:

ancestor(X,Z) :- parent(X,Z).
ancestor(X,Z) :- parent(Y,Z), ancestor(X,Y).

A program:

state(1,0).
state(S,T) :- T > 0, T1 is T-1, state(S1,T1), nextstate(S,S1,T1).
nextstate(S,S1,_) :- S is S1+1.    % arithmetic is evaluated with is/2

A goal:

?- state(X,3).


Prolog interpreter

prog - program to be executed;

goals - list of goals, which initially contains the goal given by a user;

unifier(x,y) - produces the most general unifier of x and y, or nil ;

apply(x,L) - applies a unifier x to each element of a list L producing a new list.


Prolog interpreter

A.1.3:
exec(prog,goals,success)=
if empty(goals) then success( )
else goal:=head(goals);
    goals:=tail(goals);
    L: {rest:=prog;
        while not empty(rest) do
            U:=unifier(goal,head(head(rest)));
            if U ≠ nil then
                goals1:=apply(U,tail(head(rest)));
                exec(prog,goals1,success);
                if success then exit L fi
            fi;
            rest:=tail(rest)
        od;
        failure( )
    };
    exec(prog,goals,success)
fi
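As an illustration of how such an interpreter behaves, here is a minimal Python sketch of depth-first resolution with unification. It is not a transcription of A.1.3: it returns answers through a generator instead of the success( ) and failure( ) calls, renames clause variables before each use, and omits the occurs check. The term encoding and the names unify, solve, rename and resolve are assumptions made for this example; the program is the ancestor program combined with parent facts from the exercise.

```python
import itertools

# Assumed term encoding: variables are capitalised strings, constants are
# other strings, compound terms are tuples (functor, arg1, ..., argN).

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    """Follow variable bindings in substitution s until an unbound term."""
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(x, y, s):
    """Extend substitution s so that x and y become equal; None on failure."""
    x, y = walk(x, s), walk(y, s)
    if x == y:
        return s
    if is_var(x):
        return {**s, x: y}
    if is_var(y):
        return {**s, y: x}
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for a, b in zip(x, y):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None

def rename(t, n):
    """Give the variables of a clause fresh names for one use of the clause."""
    if is_var(t):
        return f"{t}_{n}"
    if isinstance(t, tuple):
        return tuple(rename(a, n) for a in t)
    return t

fresh = itertools.count()

def solve(prog, goals, s):
    """Yield every substitution under which all goals are provable."""
    if not goals:
        yield s
        return
    goal, rest = goals[0], goals[1:]
    for head, body in prog:  # try the clauses in program order
        n = next(fresh)
        s1 = unify(goal, rename(head, n), s)
        if s1 is not None:
            yield from solve(prog, [rename(b, n) for b in body] + rest, s1)

def resolve(t, s):
    """Apply substitution s to term t."""
    t = walk(t, s)
    if isinstance(t, tuple):
        return tuple(resolve(a, s) for a in t)
    return t

# Each clause is a pair (head, body); facts have an empty body.
prog = [
    (("parent", "pam", "bob"), []),
    (("parent", "tom", "bob"), []),
    (("parent", "bob", "ann"), []),
    (("parent", "bob", "pat"), []),
    (("parent", "pat", "jim"), []),
    (("ancestor", "X", "Z"), [("parent", "X", "Z")]),
    (("ancestor", "X", "Z"), [("parent", "Y", "Z"), ("ancestor", "X", "Y")]),
]

query = ("ancestor", "Who", "jim")
answers = [resolve("Who", s) for s in solve(prog, [query], {})]
print(answers)
```

The generator plays the role of backtracking: asking for the next answer resumes the search at the most recent clause choice that still has untried alternatives.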


Semantic networks

Linguists noticed long ago that the structure of a sentence can be represented as a network. The words of the sentence are nodes, bound by arcs that express relations between the words. The network as a whole thus represents the meaning of the sentence in terms of the meanings of its words and the relations between them. This meaning approximates the meaning people assign to the sentence, much as floating point numbers approximate real numbers.


Example

[Figure: semantic network of the two sentences below. Action nodes: pick up, have a meeting, give. Other nodes: John, report, his, me, morning, lunch. Arc labels: who?, what?, whose?, to whom?, at the time, before, after.]

John must pick up his report in the morning and have a meeting after lunch. After the meeting he will give the report to me.


Example continued

Inferences can be made, depending on the properties of the relations of a semantic network. Let us consider only time relations of the network in our example, and encode the time relations by atomic formulas as follows:

General knowledge:

before(lunch,morning)
after(morning,lunch)

Specific knowledge:

after(lunch,have a meeting)
after(have a meeting,give)
at-the-time(pick up,morning)


Example continued

Inference rules:

before(x,y)  before(y,z)
------------------------
before(x,z)

after(x,y)
-----------
before(y,x)

at-the-time(x,z)  before(y,z)
-----------------------------
before(y,x)

Applying these rules, we can infer

after(lunch,have a meeting)
----------------------------
before(have a meeting,lunch)

and

at-the-time(pick up,morning)  before(lunch,morning)
---------------------------------------------------
before(lunch,pick up)

etc.
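The three inference rules can be applied mechanically until nothing new is derived. The following Python sketch does this by simple forward chaining over the time relations of the example. The tuple encoding of atomic formulas and the function names step and closure are assumptions of this sketch, and at-the-time is written with the argument order used in the inference shown above, at-the-time(pick up,morning).

```python
# Atomic formulas are encoded as tuples (relation, first argument, second argument).
facts = {
    ("before", "lunch", "morning"),
    ("after", "morning", "lunch"),
    ("after", "lunch", "have a meeting"),
    ("after", "have a meeting", "give"),
    ("at-the-time", "pick up", "morning"),
}

def step(facts):
    """Apply each inference rule once to every matching fact or pair of facts."""
    new = set()
    for (r1, a, b) in facts:
        if r1 == "after":
            # after(x,y) gives before(y,x)
            new.add(("before", b, a))
        for (r2, c, d) in facts:
            if r1 == "before" and r2 == "before" and b == c:
                # before(x,y), before(y,z) give before(x,z)
                new.add(("before", a, d))
            if r1 == "at-the-time" and r2 == "before" and b == d:
                # at-the-time(x,z), before(y,z) give before(y,x)
                new.add(("before", c, a))
    return new

def closure(facts):
    """Repeat step until no new formula appears."""
    facts = set(facts)
    while True:
        new = step(facts) - facts
        if not new:
            return facts
        facts |= new

derived = closure(facts)
for formula in sorted(derived - facts):
    print(formula)
```

The loop terminates because only finitely many before formulas can be built from the names that occur in the facts.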


Frames

1. The essence of the frame is that it is a module of knowledge about something which we can call a concept. This can be a situation, an object, a phenomenon or a relation.

2. Frames contain smaller pieces of knowledge: components, attributes, actions which can be (or must be) taken when conditions for taking an action occur.

3. Frames contain slots which are places to put pieces of knowledge in. These pieces may be just concrete values of attributes, more complicated objects, or even other frames. A slot is being filled in when a frame is applied to represent a particular situation, object or phenomenon.


Inheritance

[Figure: inheritance hierarchy. Recoverable node labels, from top to bottom: ideas; events, states, things; actions, abstract things; polygons; triangles, quadrangles; parallelograms; rectangles, rhombuses.]

An essential idea developed in connection with frames was inheritance. Inheritance is a convenient way of reusing existing knowledge in describing new frames. Knowing a frame f, one can describe a new frame as a kind of f, meaning that the new frame inherits the properties of f, i.e. it has these properties in addition to its newly described properties. The inheritance relation expresses precisely the relation between super- and subconcepts.
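A frame with slots and "a kind of" inheritance can be sketched in a few lines of Python. The class Frame, its methods get and is_a, and all slot names and values below are hypothetical and chosen only for illustration; the concept names come from the polygon branch of the hierarchy.

```python
class Frame:
    """A module of knowledge about a concept, with named slots."""

    def __init__(self, name, kind_of=None, **slots):
        self.name = name
        self.kind_of = kind_of      # the superconcept frame, if any
        self.slots = dict(slots)    # slots described in this frame itself

    def get(self, slot):
        """Look a slot up here first, then in the inherited frames."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.kind_of
        raise KeyError(slot)

    def is_a(self, other):
        """True if this frame is other or a subconcept of other."""
        frame = self
        while frame is not None:
            if frame is other:
                return True
            frame = frame.kind_of
        return False

# Slot names and values are assumptions of this example.
polygon = Frame("polygon", closed=True)
triangle = Frame("triangle", kind_of=polygon, sides=3)
quadrangle = Frame("quadrangle", kind_of=polygon, sides=4)
parallelogram = Frame("parallelogram", kind_of=quadrangle, parallel_pairs=2)
rectangle = Frame("rectangle", kind_of=parallelogram, right_angles=4)

print(rectangle.get("sides"), rectangle.get("closed"))
```

A new frame states only what distinguishes it from its superconcept; everything else is found by following the kind_of links.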


Default theories

A default has the following form:

A : B1, ..., Bk
---------------
C

where the formula A is a premise, the formula C is a conclusion and the formulas B1, ..., Bk are justifications. The conclusion of a default can be derived from its premise if the negation of no justification has been derived.


Examples

1.

bird(x) : flies(x)
------------------
flies(x)

2. Closed world assumption (CWA):

: not F
-------
not F


Derivation step with a default.

A - premise of a default
C - conclusion of a default
J - justifications of a default

A.1.4:
Default(A,C,J):
for B ∈ J do
    if derivable(¬B) then failure( ) fi
od;
success( )
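A derivation step with a default can be sketched in Python as follows. This is a simplification under stated assumptions: formulas are plain strings, negation is written with the prefix "not ", and "derivable" is reduced to membership in a set of known formulas. The sketch also checks the premise explicitly, which A.1.4 leaves to the caller. The names negate and apply_default are hypothetical.

```python
def negate(formula):
    """Negation on the assumed string encoding of formulas."""
    return formula[4:] if formula.startswith("not ") else "not " + formula

def apply_default(premise, conclusion, justifications, known):
    """Add the conclusion to known if the default is applicable.

    The default is blocked as soon as the negation of one of its
    justifications is known. Returns True if the conclusion was added.
    """
    if premise not in known:
        return False
    for b in justifications:
        if negate(b) in known:
            return False
    known.add(conclusion)
    return True

# Example 1 of the previous slide, instantiated for two birds.
known = {"bird(tweety)", "bird(opus)", "not flies(opus)"}
tweety = apply_default("bird(tweety)", "flies(tweety)", ["flies(tweety)"], known)
opus = apply_default("bird(opus)", "flies(opus)", ["flies(opus)"], known)
print(tweety, opus)
```

With a real prover in place of the membership test, the conclusion may have to be withdrawn later if the negation of a justification becomes derivable, which is what makes reasoning with defaults nonmonotonic.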


Rules

Rules are a well-known form of knowledge which is easy to use. A rule is a pair

(condition, action)

which has the meaning: "If the condition is satisfied, then the action can be taken." Other modalities for performing the action are also possible, for instance "must be taken".


Using rules

Let us have a set of rules called rules and functions cond(p) and act(p) which select the condition part and action part of a given rule p and present them in the executable form. The following is a simple algorithm for problem solving with rules:

A.1.5:
while not good do
    found:=false;
    for p ∈ rules do
        if cond(p) then act(p); found:=true fi
    od;
    if not found then failure( ) fi
od
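The loop of A.1.5 translates almost directly into Python. In this sketch a rule is a pair of functions (cond, act) that inspect and modify a shared state, and good is a predicate on that state. The names run_rules and Failure, and the example rules about clouds and rain, are assumptions made for illustration.

```python
class Failure(Exception):
    """Raised when the goal is not reached and no rule is applicable."""

def run_rules(rules, state, good):
    """Apply applicable rules pass by pass until good(state) holds."""
    while not good(state):
        found = False
        for cond, act in rules:
            if cond(state):
                act(state)
                found = True
        if not found:
            raise Failure("no applicable rule")
    return state

# Hypothetical rules over a set of facts.
rules = [
    (lambda s: "clouds" in s, lambda s: s.add("rain")),
    (lambda s: "rain" in s, lambda s: s.add("wet streets")),
]

result = run_rules(rules, {"clouds"}, lambda s: "wet streets" in s)
print(sorted(result))
```

As in A.1.5, the loop does not detect rules that stay applicable without making progress; termination depends on the rules eventually making good true or becoming inapplicable.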


Decision trees

A simple way to represent rules is a decision tree: a tree with nodes for attributes and arcs for attribute values. Example:

legs
    two: hands
        no: bird
        yes: furry
            yes: monkey
            no: man
    four: furry
        no: table
        yes: animal
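A decision tree of this kind can be stored as nested data and evaluated by walking from the root to a leaf. The sketch below assumes the example tree reads as follows: two legs and no hands give a bird, two legs with hands give a monkey if furry and a man otherwise, four legs give an animal if furry and a table otherwise. The tuple and dictionary encoding and the function name classify are choices of this sketch.

```python
# An inner node is (attribute, {value: subtree}); a leaf is a class name.
tree = ("legs", {
    "two": ("hands", {
        "no": "bird",
        "yes": ("furry", {"yes": "monkey", "no": "man"}),
    }),
    "four": ("furry", {"no": "table", "yes": "animal"}),
})

def classify(node, obj):
    """Follow the arc for obj's value of each attribute until a leaf is reached."""
    while not isinstance(node, str):
        attribute, branches = node
        node = branches[obj[attribute]]
    return node

print(classify(tree, {"legs": "two", "hands": "yes", "furry": "no"}))
```

Each path from the root to a leaf corresponds to one rule whose condition is the conjunction of the attribute values along the path.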


Rete algorithm

The Rete algorithm uses a data structure that enables fast search for applicable rules. We shall consider it in two parts:

• knowledge representation,
• knowledge management (i.e. introduction of changes into the knowledge base).

Any rule that is reachable in the Rete graph (see below) via nonempty relation nodes can be fired.

The Rete algorithm is used in JESS (Java Expert System Shell, developed at Sandia National Laboratories) and in its predecessor CLIPS (developed at NASA).


Rete algorithm continued

Knowledge includes:

1. facts, e.g.
(goal e1 simplify), (goal e2 simplify), (goal e3 simplify),
(expr e1 0 + 3), (expr e2 0 + 5), (expr e3 0 * 2), ...

2. patterns, e.g.
(goal ?x simplify)
(expr ?y 0 ?op ?a2)
(parent ?x ?y)
...

3. and rules, e.g.
(R1 (goal ?x simplify) (expr ?x 0 + ?y) => (expr ?x ?y))
(R2 (goal ?x simplify) (expr ?x 0 * ?y) => (expr ?x 0))
...



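Knowledge is represented in the form of an acyclic graph. For the example above, the graph is as follows:

[Figure: Rete graph of the example. Below the root are the predicate nodes goal and expr. Below them are the pattern nodes goal, expr * and expr +. The goal pattern holds the relation x = e1, e2, e3; the expr * pattern holds (x, y) = (e3, 2); the expr + pattern holds (x, y) = (e1, 3), (e2, 5). The figure also shows the relations y = 2 and y = 3, 5 and two-input nodes leading to the rule nodes R2 and R1.]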



The overall structure of the Rete graph is the following:

root

predicate names layer

patterns layer - alpha-nodes (with one input)

beta-nodes (with two inputs)

rules layer - one node for every rule


Adding facts to Rete graph

When a fact arrives then

1. Select the predicate

2. Select the pattern

3. For every relation depending on the selected pattern update the relation (add a new line to the relation).


Rete algorithm continued

The Rete graph is built, updated and used as follows:

1. One level down from the root are placed all predicate names.

2. The next level down contains alpha-nodes for all patterns of all rules as successors of their predicate names.

3. Beta-nodes of the following levels down (with two inputs each) include relations that unify with the patterns along the path from the root to the node.

4. The paths lead finally to nodes representing rules.

5. When a new knowledge item arrives, it is placed into the correct places. Finding the places is simple and straightforward, because it is guided by a relation in every node.

6. When a goal is given, the search is simple and straightforward, because it is guided by a relation in every node.
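The steps above can be illustrated with a strongly simplified Python sketch. It keeps one relation per pattern (the alpha layer) and indexes patterns by predicate name, so that an arriving fact touches only the patterns of its own predicate. Unlike a real Rete network it does not store the joined relations of the beta-nodes incrementally; it recomputes the join when activations are requested. The class Network, its methods, and the restriction to rules with exactly two patterns are assumptions of this sketch, not the Jess or CLIPS implementation.

```python
from collections import defaultdict

def match(pattern, fact):
    """Return the variable bindings that make pattern equal to fact, or None."""
    if len(pattern) != len(fact):
        return None
    row = {}
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if row.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return row

class Network:
    def __init__(self, rules):
        self.rules = rules                     # rule name -> (pattern1, pattern2)
        self.by_predicate = defaultdict(set)   # predicate names layer
        self.alpha = defaultdict(list)         # pattern -> relation (list of rows)
        for patterns in rules.values():
            for p in patterns:
                self.by_predicate[p[0]].add(p)

    def add_fact(self, fact):
        for pattern in self.by_predicate[fact[0]]:   # 1. select the predicate
            row = match(pattern, fact)               # 2. select the pattern
            if row is not None:
                self.alpha[pattern].append(row)      # 3. update the relation

    def activations(self):
        """Join the two relations of each rule on their shared variables."""
        result = []
        for name, (p1, p2) in self.rules.items():
            for r1 in self.alpha[p1]:
                for r2 in self.alpha[p2]:
                    if all(r2.get(k, v) == v for k, v in r1.items()):
                        result.append((name, {**r1, **r2}))
        return result

net = Network({
    "R1": (("goal", "?x", "simplify"), ("expr", "?x", "0", "+", "?y")),
    "R2": (("goal", "?x", "simplify"), ("expr", "?x", "0", "*", "?y")),
})
for fact in [("goal", "e1", "simplify"), ("goal", "e2", "simplify"),
             ("goal", "e3", "simplify"), ("expr", "e1", "0", "+", "3"),
             ("expr", "e2", "0", "+", "5"), ("expr", "e3", "0", "*", "2")]:
    net.add_fact(fact)

for name, bindings in net.activations():
    print(name, bindings)
```

Because R1 and R2 share the pattern (goal ?x simplify), the sketch stores its relation only once, which mirrors the sharing of nodes in the Rete graph.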


Rules with plausibilities

Rules can be extended by adding plausibility values to them. Let us associate with each rule p a plausibility value c(p) of application of the rule. These values are in the range from 0 to 1. We shall consider as satisfactory only the results of applying a sequence of rules p, ..., q for which the plausibilities c(p), ..., c(q) satisfy the condition c(p) * ... * c(q) > cm, where cm is the minimal satisfactory plausibility of the result. When selecting a new applicable rule, it is now reasonable to select the rule with the highest plausibility value.


Plausibilities

A.1.6:
c:=1;
while not good do
    x:=cm;
    for p ∈ rules do
        if cond(p) and c(p) > x then a:=act(p); x:=c(p) fi
    od;
    c:=c*x;
    if c > cm then a else failure( ) fi
od;
success( )


Using a plausibility function

A.1.7:
c:=1;
while not good do
    x:=cm;
    for p ∈ rules do
        if cond(p) and plausibility(c(p),c) > x then a:=act(p); x:=plausibility(c(p),c) fi
    od;
    c:=x;
    if c > cm then a else failure( ) fi
od;
success( )
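The selection of the most plausible applicable rule can be sketched in Python as follows. The sketch follows the structure of A.1.7 and uses the product c(p) * c as its default plausibility function, which is an assumption chosen to match the condition c(p) * ... * c(q) > cm. Rules are triples (cond, act, c_p); the function name solve and the example rules are hypothetical.

```python
def solve(rules, state, good, cm, plausibility=lambda cp, c: cp * c):
    """Apply the most plausible applicable rule until good(state) holds.

    Returns the final state and the plausibility of the result, or raises
    RuntimeError when no applicable rule keeps the plausibility above cm.
    """
    c = 1.0
    while not good(state):
        x, best = cm, None
        for cond, act, cp in rules:
            if cond(state):
                value = plausibility(cp, c)
                if value > x:
                    x, best = value, act
        if best is None:
            raise RuntimeError("failure: plausibility would not exceed cm")
        c = x
        best(state)
    return state, c

# Hypothetical rules over a set of facts, with plausibility values.
rules = [
    (lambda s: "a" in s and "b" not in s, lambda s: s.add("b"), 0.9),
    (lambda s: "a" in s and "b" not in s, lambda s: s.add("z"), 0.6),
    (lambda s: "b" in s and "c" not in s, lambda s: s.add("c"), 0.8),
]

state, c = solve(rules, {"a"}, lambda s: "c" in s, cm=0.7)
print(sorted(state), round(c, 2))
```

With cm = 0.7 the rules with plausibilities 0.9 and 0.8 are applied, giving 0.9 * 0.8 = 0.72 > 0.7. With cm = 0.75 the second step would drop the plausibility to 0.72, so the search fails.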


Classification of knowledge systems

KNOWLEDGE SYSTEMS

Symbolic (derivability, soundness, completeness)

Rules (efficient computability)

Semantic networks (eloquence, simplicity)

Frames (modularity, inheritance)


Exercise

Facts:

parent(pam,bob).
parent(tom,bob).
parent(tom,liz).
parent(bob,ann).
parent(bob,pat).
parent(pat,jim).

Questions and answers:

?- parent(bob,pat). … yes
?- parent(liz,pat). … no
?- parent(X,liz). … X = tom
?- parent(bob,X). … X = ann ; … X = pat ; … no


Bibliography

• Bratko, I. (2001) Prolog Programming for Artificial Intelligence. Addison-Wesley.

• http://herzberg.ca.sandia.gov/jess/docs/ (Jess and the Rete algorithm)

• Genesereth, M., Nilsson, N. (1986) Logical Foundations of Artificial Intelligence. Morgan Kaufmann.
