
International Graduate School of Dynamic Intelligent Systems

Machine Learning

RG Knowledge Based Systems

Hans Kleine Büning

Hans Kleine Büning, 9 January 2009
RG Knowledge Based Systems, University of Paderborn

Outline

Learning by Example Motivation Decision Trees ID3 Overfitting Pruning Exercise

Reinforcement Learning Motivation Markov Decision Processes Q-Learning Exercise



Motivation

Partly inspired by human learning.

Objectives:
- classify entities according to given examples
- find structure in large databases
- gain new knowledge from the samples

Input: learning examples with assigned attributes and assigned classes

Output: a general classifier for the given task


Classifying Training Examples

Training Example for EnjoySport

General Training Examples


Attributes & Classes

- Attribute: Ai; number of different values for Ai: |Ai|
- Class: Ci; number of different classes: |C|

Premises:
- n > 2
- consistent examples (no two objects with the same attributes but different classes)


Possible Solutions

- Decision Trees: ID3, C4.5, CART
- Rule-Based Systems
- Clustering
- Neural Networks: Backpropagation, Neuroevolution


Decision Trees

Idea: classify entities using if-then rules.

Example: classifying mushrooms.
Attributes: Colour, Size, Points. Classes: eatable, poisonous.

Resulting rules:
if (Colour = red) and (Size = small) then poisonous
if (Colour = green) then eatable

Colour  Size   Points  Class
red     small  yes     poisonous
brown   small  no      eatable
brown   big    yes     eatable
green   small  no      eatable
red     big    no      eatable

[Decision tree: Colour: red → Size (small → poisonous/1, big → eatable/1); green → eatable/1; brown → eatable/2]


Decision Trees

There exist different decision trees for the same task. On average, the left tree decides earlier.

Left tree:  Colour: red → Size (small → poisonous/1, big → eatable/1); green → eatable/1; brown → eatable/2
Right tree: Size: small → Points (yes → poisonous/1, no → eatable/2); big → eatable/2


How to measure tree quality?

- Number of leaves? = number of generated rules
- Tree height? = maximum rule length
- External path length? = sum of the lengths of all paths from root to leaf = amount of memory needed for all rules
- Weighted external path length: like external path length, but paths are weighted by the number of objects they represent


Back to the Example

Left tree:  Colour: red → Size (small → poisonous/1, big → eatable/1); green → eatable/1; brown → eatable/2
Right tree: Size: small → Points (yes → poisonous/1, no → eatable/2); big → eatable/2

Criterion                      Left Tree  Right Tree
number of leaves               4          3
height                         2          2
external path length           6          5
weighted external path length  7          8
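The four quality measures can be recomputed on the two example trees. A minimal sketch; the nested-tuple tree representation and all names are mine, not from the slides (note that, as drawn, the right tree has 3 leaves):

```python
# Leaf: ("leaf", class_label, object_count); inner node: ("split", attribute, {value: subtree})
left = ("split", "Colour", {
    "red":   ("split", "Size", {"small": ("leaf", "poisonous", 1),
                                "big":   ("leaf", "eatable",   1)}),
    "green": ("leaf", "eatable", 1),
    "brown": ("leaf", "eatable", 2)})
right = ("split", "Size", {
    "small": ("split", "Points", {"yes": ("leaf", "poisonous", 1),
                                  "no":  ("leaf", "eatable",   2)}),
    "big":   ("leaf", "eatable", 2)})

def leaves(t):
    return 1 if t[0] == "leaf" else sum(leaves(c) for c in t[2].values())

def height(t):
    return 0 if t[0] == "leaf" else 1 + max(height(c) for c in t[2].values())

def epl(t, d=0):  # external path length: sum of root-to-leaf depths
    return d if t[0] == "leaf" else sum(epl(c, d + 1) for c in t[2].values())

def wepl(t, d=0):  # weighted external path length: depths weighted by object counts
    return d * t[2] if t[0] == "leaf" else sum(wepl(c, d + 1) for c in t[2].values())

for t in (left, right):
    print(leaves(t), height(t), epl(t), wepl(t))
# left: 4 2 6 7; right: 3 2 5 8
```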


Weighted External Path Length

Idea from information theory. Given: a text which should be compressed and the probabilities of character occurrence. Result: a coding tree.

Example: text eeab with p(e) = 0.5, p(a) = 0.25, p(b) = 0.25.
[Coding tree: 1 → e, 00 → a, 01 → b.] Encoding: 110001

Build the tree according to the information content.
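The example can be checked in a few lines; the code table below is read off the coding tree (e → 1, a → 00, b → 01):

```python
from math import log2

text = "eeab"
p = {c: text.count(c) / len(text) for c in set(text)}   # p(e)=0.5, p(a)=p(b)=0.25
H = -sum(q * log2(q) for q in p.values())               # entropy: 1.5 bits per character

code = {"e": "1", "a": "00", "b": "01"}                 # prefix code from the tree
encoded = "".join(code[c] for c in text)

print(encoded, H)   # 110001 1.5
```

The encoded length (6 bits for 4 characters) matches the entropy of 1.5 bits per character, i.e. the code is optimal for this distribution.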


Entropy

Entropy = measure of the mean information content.

In general, for probabilities p1, …, pn:

H = − Σi pi log2 pi

Mean number of bits to encode each element under an optimal encoding (= mean height of the theoretically optimal encoding tree).

[Plot: the contribution −p log2 p of a single symbol as a function of its probability p]
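As a sketch, entropy of a class distribution given as absolute counts (function name and interface are mine):

```python
from math import log2

def entropy(counts):
    """Entropy in bits of a distribution given as absolute counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

print(entropy([1, 1]))  # fair coin: 1.0 bit
print(entropy([1, 4]))  # e.g. 1 poisonous vs 4 eatable mushrooms: ~0.722 bits
```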


Information Gain

Information gain = expected reduction of entropy due to sorting on an attribute A.

Conditional entropy: H(C|A) = Σa p(a) · H(C|A = a)

Information gain: Gain(C, A) = H(C) − H(C|A)
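Both quantities can be sketched directly from the formulas; the example numbers use the mushroom data split by Colour (all names are mine):

```python
from math import log2

def entropy(counts):
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c > 0)

def conditional_entropy(partition):
    """partition: one list of class counts per attribute value a, so H(C|A) = sum_a p(a) H(C|A=a)."""
    total = sum(sum(g) for g in partition)
    return sum(sum(g) / total * entropy(g) for g in partition)

def information_gain(class_counts, partition):
    return entropy(class_counts) - conditional_entropy(partition)

# Mushrooms split by Colour: red -> 1 poisonous + 1 eatable, brown -> 2 eatable, green -> 1 eatable
gain = information_gain([1, 4], [[1, 1], [0, 2], [0, 1]])
print(round(gain, 4))  # 0.3219
```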


Entropy & Decision Trees

Use conditional entropy and information gain for selecting split attributes.

Chosen split attribute Ak with possible values a1, …, am:
- xi – number of objects with value ai for Ak
- xi,j – number of objects with value ai for Ak and class Cj
- p(ai) = xi / Σl xl – probability that one of the objects has attribute value ai
- p(Cj | ai) = xi,j / xi – probability that an object with attribute value ai has class Cj


Decision Tree Construction

Choose the split attribute Ak which gives the highest information gain or, equivalently, the smallest conditional entropy H(C|Ak).

Example: colour

Colour  Size   Points  Class
red     small  yes     poisonous
brown   small  no      eatable
brown   big    yes     eatable
green   small  no      eatable
red     big    no      eatable


Decision Tree Construction (2)

Analogously:
H(C|Acolour) = 0.4
H(C|Asize) ≈ 0.551
H(C|Apoints) = 0.4

Choose colour or points as the first split criterion. Recursively repeat this procedure.

[Split on Points:
yes → {(red, small): poisonous, (brown, big): eatable}
no → {(red, big), (brown, small), (green, small): all eatable}]
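The three conditional entropies can be recomputed from the table. A sketch (attribute encoding and names are mine); note that on this data the value for Size comes out as ≈ 0.551:

```python
from math import log2
from collections import Counter, defaultdict

data = [  # (Colour, Size, Points, Class)
    ("red",   "small", "yes", "poisonous"),
    ("brown", "small", "no",  "eatable"),
    ("brown", "big",   "yes", "eatable"),
    ("green", "small", "no",  "eatable"),
    ("red",   "big",   "no",  "eatable"),
]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def cond_entropy(rows, attr):
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr]].append(row[-1])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

for name, i in (("colour", 0), ("size", 1), ("points", 2)):
    print(name, round(cond_entropy(data, i), 3))
# colour 0.4, size 0.551, points 0.4
```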


Decision Tree Construction (3)

Right side is trivial: all three objects are eatable → leaf eatable/3.

Left side: both remaining attributes have the same information gain, so choose either.

[Resulting tree: Points: yes → Colour (red → poisonous/1, brown → eatable/1); no → eatable/3]
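The whole procedure fits into one short recursive function. A sketch of ID3 under the assumptions above (tree representation and names are mine; ties are broken by attribute order, so on the mushroom data it first splits on Colour):

```python
from math import log2
from collections import Counter, defaultdict

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def id3(rows, attrs):
    """rows: tuples with the class label last; attrs: list of usable attribute indices."""
    labels = [r[-1] for r in rows]
    if len(set(labels)) == 1:                 # pure node -> leaf
        return labels[0]
    if not attrs:                             # no attribute left -> majority class
        return Counter(labels).most_common(1)[0][0]

    def cond_entropy(a):                      # H(C|A_a); smaller = higher information gain
        groups = defaultdict(list)
        for r in rows:
            groups[r[a]].append(r[-1])
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

    best = min(attrs, key=cond_entropy)
    return (best, {v: id3([r for r in rows if r[best] == v],
                          [a for a in attrs if a != best])
                   for v in {r[best] for r in rows}})

def classify(tree, row):
    while isinstance(tree, tuple):
        attr, children = tree
        tree = children[row[attr]]
    return tree

data = [("red", "small", "yes", "poisonous"), ("brown", "small", "no", "eatable"),
        ("brown", "big", "yes", "eatable"),   ("green", "small", "no", "eatable"),
        ("red", "big", "no", "eatable")]
tree = id3(data, [0, 1, 2])
print(tree[0])   # 0, i.e. the root splits on Colour
```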


Generalisation

The classifier should also be able to handle unknown data. The classifying model is often called a hypothesis.

Testing generality: divide the samples into
- a training set
- a validation (test) set

Learn from the training set; test generality on the validation set.

Error computation: for a test set X and hypothesis h, error(X, h) is a function which is monotonically increasing in the number of examples in X wrongly classified by h.


Overfitting

The learnt hypothesis performs well on the training set but poorly on the validation set.

Formally: h is overfitted if there exists a hypothesis h' with error(D, h) < error(D, h') and error(X, h) > error(X, h'), where D is the training set and X the validation set.


Avoiding Overfitting

Stopping: don't split further if some criterion is met. Examples:
- Size of node n: don't split if n contains fewer than a threshold number of examples.
- Purity of node n: don't split if the purity gain is not big enough.

Pruning: reduce the decision tree after training. Examples:
- Reduced Error Pruning
- Minimal Cost-Complexity Pruning
- Rule Post-Pruning


Pruning

Pruning notation:
- Tn – the branch of T rooted at inner node n
- T/Tn – the tree obtained from T by replacing the branch Tn with a leaf

If T' was produced by (repeated) pruning on T, we write T' ≤ T.


Maximum Tree Creation

Before pruning we need a maximum tree Tmax.

What is a maximum tree? A tree in which
- all leaf nodes are smaller than some threshold, or
- all leaf nodes represent only one class, or
- all leaf nodes contain only objects with the same attribute values.

Tmax is then pruned starting from the leaves.


Reduced Error Pruning

1. Consider a branch Tn of T.
2. Replace Tn by a leaf with the class most frequently associated with Tn.
3. If error(X, h(T)) < error(X, h(T/Tn)), take back the decision.
4. Go back to 1 until all non-leaf nodes have been considered.
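Steps 1-4 can be sketched on a toy tree representation: an inner node is (attribute_index, {value: subtree}), a leaf is a class label; all names and the tiny data set are mine. The sketch prunes bottom-up and keeps the replacement leaf only when it does not hurt the validation error:

```python
from collections import Counter

def classify(tree, row):
    while isinstance(tree, tuple):
        attr, children = tree
        tree = children[row[attr]]
    return tree

def prune(tree, train_rows, val_rows):
    if not isinstance(tree, tuple) or not train_rows:
        return tree
    attr, children = tree
    # steps 1 and 4: consider every branch, bottom-up
    pruned = (attr, {v: prune(c,
                              [r for r in train_rows if r[attr] == v],
                              [r for r in val_rows if r[attr] == v])
                     for v, c in children.items()})
    # step 2: candidate leaf with the class most frequently associated with the branch
    leaf = Counter(r[-1] for r in train_rows).most_common(1)[0][0]
    # step 3: take the replacement back if the unpruned tree is strictly better on X
    err_tree = sum(classify(pruned, r) != r[-1] for r in val_rows)
    err_leaf = sum(leaf != r[-1] for r in val_rows)
    return leaf if err_leaf <= err_tree else pruned

train = [("a", "x"), ("a", "x"), ("b", "y")]
tree = (0, {"a": "x", "b": "y"})
print(prune(tree, train, [("a", "x"), ("b", "x")]))  # noisy validation set: pruned to leaf 'x'
print(prune(tree, train, [("a", "x"), ("b", "y")]))  # clean validation set: tree kept
```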


Exercise

Fred wants to buy a VW Beetle and classifies all offerings into the classes interesting and uninteresting. Help Fred by creating a decision tree using the ID3 algorithm.

Colour  Year of Construction  Mileage       Class
red     1975                  > 200 000 km  interesting
blue    1980                  > 200 000 km  uninteresting
green   1975                  < 200 000 km  interesting
red     1975                  > 200 000 km  interesting
green   1970                  < 200 000 km  uninteresting
blue    1975                  > 200 000 km  uninteresting
yellow  1970                  < 200 000 km  interesting


Outline

Learning by Example Motivation Decision Trees ID3 Overfitting Pruning Exercise

Reinforcement Learning Motivation Markov Decision Processes Q-Learning Exercise


Reinforcement Learning: The Idea

A way of programming agents by reward and punishment without specifying how the task is to be achieved


Learning to Balance on a Bicycle

States:
- angle of handlebars
- angular velocity of handlebars
- angle of bicycle to vertical
- angular velocity of bicycle to vertical
- acceleration of angle of bicycle to vertical


Learning to Balance on a Bicycle

Actions:
- torque to be applied to the handlebars
- displacement of the centre of mass from the bicycle's plane (in cm)


Reward: if the angle of the bicycle to vertical is greater than 12°, reward = −1; otherwise reward = 0.


Reinforcement Learning: Applications

- Board games: TD-Gammon, a program based on reinforcement learning, has become a world-class backgammon player.
- Controlling a mobile robot: learning to drive a bicycle, navigation, pole balancing, Acrobot, robot soccer.
- Learning to control sequential processes: elevator dispatching.


Deterministic Markov Decision Process
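The slide's formal content was not preserved in this transcript; for reference, the standard definition of a deterministic MDP is:

```latex
\begin{align*}
&\text{states } S, \quad \text{actions } A,\\
&\text{transition function } \delta : S \times A \to S \quad (s_{t+1} = \delta(s_t, a_t)),\\
&\text{reward function } r : S \times A \to \mathbb{R} \quad (r_t = r(s_t, a_t)).
\end{align*}
```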


Value of Policy and Agent’s Task
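The slide's formulas were not preserved; the usual statement is that the value of a policy π is the discounted sum of rewards obtained by following it, and the agent's task is to find an optimal policy π*:

```latex
V^{\pi}(s_t) = \sum_{i=0}^{\infty} \gamma^{\,i}\, r_{t+i}, \qquad 0 \le \gamma < 1,
\qquad
\pi^{*} = \operatorname*{argmax}_{\pi} V^{\pi}(s) \quad \text{for all } s \in S.
```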


Nondeterministic Markov Decision Process

[Figure: one action may lead to different successor states, e.g. with probabilities P = 0.8, P = 0.1, P = 0.1]


Methods

Model (reward function and transition probabilities) known:
- discrete states: Dynamic Programming
- continuous states: Value Function Approximation + Dynamic Programming

Model (reward function or transition probabilities) unknown:
- discrete states: Reinforcement Learning
- continuous states: Value Function Approximation + Reinforcement Learning


Q-learning Algorithm
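The algorithm itself was shown as a figure that is not preserved here. A minimal sketch of the deterministic Q-learning update Q(s, a) ← r + γ · max_a' Q(s', a') on a toy corridor world (the environment is mine, chosen only to illustrate the update):

```python
# Corridor world: states 0..3, the goal is state 3; actions move left (-1) or right (+1).
GOAL, GAMMA = 3, 0.9
ACTIONS = (-1, +1)
Q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}

def step(s, a):
    s2 = min(max(s + a, 0), GOAL)
    return s2, (100 if s2 == GOAL else 0)   # reward 100 for reaching the goal

# Exploration that visits every state-action pair; repeat until the Q-table is stable.
for _ in range(20):
    for s in range(GOAL):
        for a in ACTIONS:
            s2, r = step(s, a)
            Q[(s, a)] = r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)

print([round(Q[(s, +1)], 1) for s in range(GOAL)])  # [81.0, 90.0, 100.0]
```

In the nondeterministic case the assignment is replaced by a learning-rate update, Q(s, a) ← (1 − α)·Q(s, a) + α·(r + γ · max_a' Q(s', a')).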


Example

[Worked Q-learning example: Q-table initialization, step-by-step updates during Episodes 1 and 2, and the resulting Q-tables; the figures are not preserved in this transcript.]

Example: Value Function after Convergence

Example: Optimal Policy

Q-learning


Convergence of Q-learning


Blackjack

Standard rules of blackjack hold.

State space:
- element[0] – current value of the player's hand (4-21)
- element[1] – value of the dealer's face-up card (2-11)
- element[2] – whether the player has a usable ace (0/1)

Starting states: the player has any 2 cards (uniformly distributed), the dealer has any 1 card (uniformly distributed).

Actions: HIT, STICK

Rewards: −1 for a loss, 0 for a draw, 1 for a win


Blackjack: Optimal Policy


Exercise:


Problems

- Multiagent systems: cooperative agents, competitive agents
- Continuous domains
- Partially observable MDPs (POMDPs)