Embodied Learning of Qualitative Models
Jure Žabkar
Exploration and Curiosity in Robot Learning and Inference, DAGSTUHL, March 2011
joint work with XPERO partners
problem
“How should a robot choose its actions and experiences so as to maximize the
effectiveness of its learning?”
goals
• to learn comprehensible models
• no extrinsic reward
• intrinsic reward: improved prediction model about the environment
our way
• learning from scratch (no explicit background knowledge, but given a learning algorithm)
• real robots, real-time learning
learning loop
1. observe the environment (collect data)
2. learn a model
3. use the model to predict the effect of each action
4. choose the best action (w.r.t. the active learning strategy)
5. observe the environment and check whether the predictions match the new observations
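The five steps of the loop can be sketched in Python. This is only an illustration under stated assumptions, not the implementation used on the robots: `ToyWorld`, `learn`, `predict`, `choose` and `run_loop` are hypothetical names, the "model" is just the last observed state change per action, and "choose the least-tried action" stands in for a real active learning strategy.

```python
def learn(history):
    """Step 2: per action, keep the last observed state change and the trial count."""
    model = {}
    for state, action, new_state in history:
        _, count = model.get(action, (0, 0))
        model[action] = (new_state - state, count + 1)
    return model


def predict(model, state, action):
    """Step 3: predicted next state, or None if the action was never tried."""
    if action not in model:
        return None
    return state + model[action][0]


def choose(model, actions):
    """Step 4 (assumed strategy): pick the action tried the fewest times."""
    return min(actions, key=lambda a: model.get(a, (0, 0))[1])


def run_loop(observe, execute, actions, steps=6):
    history, mismatches = [], 0
    state = observe()                                   # 1. observe
    for _ in range(steps):
        model = learn(history)                          # 2. learn a model
        predicted = {a: predict(model, state, a) for a in actions}  # 3. predict
        action = choose(model, actions)                 # 4. choose an action
        execute(action)
        new_state = observe()                           # 5. observe and check
        if predicted[action] is not None and predicted[action] != new_state:
            mismatches += 1
        history.append((state, action, new_state))
        state = new_state
    return learn(history), mismatches


class ToyWorld:
    """Hypothetical one-dimensional environment used only for this sketch."""

    def __init__(self):
        self.position = 0

    def observe(self):
        return self.position

    def execute(self, action):
        self.position += {"forward": 1, "backward": -1}[action]
```

Calling `run_loop(world.observe, world.execute, ["forward", "backward"])` on a fresh `ToyWorld` alternates the two actions and returns the learned model together with the number of predictions that failed to match the new observation.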
starting scenario
Q: how does the area of the ball (as observed by the robot) change w.r.t. the robot's actions?
area := #pixels of the red blob in the image from robot's camera
actions: sL, sR (the distance travelled by the left/right wheel)
area = area(sL,sR)
task: find the appropriate model
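The definition of area as the number of pixels in the red blob can be illustrated with a short Python sketch. The image representation (rows of `(r, g, b)` tuples), the function names `is_red` and `blob_area`, and the colour thresholds are all assumptions made for this example; the slides do not say how the blob was segmented.

```python
def is_red(pixel, min_red=150, max_other=100):
    """Assumed threshold test: strong red channel, weak green and blue."""
    r, g, b = pixel
    return r >= min_red and g <= max_other and b <= max_other


def blob_area(image):
    """Count the pixels classified as red in an image given as rows of (r, g, b)."""
    return sum(1 for row in image for pixel in row if is_red(pixel))


# A tiny 2x3 hypothetical camera frame with three red pixels.
frame = [
    [(200, 10, 10), (20, 20, 20), (255, 0, 0)],
    [(0, 0, 255), (180, 90, 90), (90, 200, 90)],
]
area = blob_area(frame)
```

With this representation, `area` becomes one numerical observation per frame, which is then related to the actions sL and sR.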
equation discovery?
we tried several algorithms, without success
motivation
people most often reason qualitatively
AI: robots should mimic human intelligence
why learning qualitative relations?
the area problem, qualitatively
if action=forward then the area increases until it becomes constant (blob occupies the whole image)
if orientation<0 and action=left (increasing the absolute value of the angle) then the area decreases until it becomes constant (zero)
...
qualitative rules
the prediction model gets much more accurate, but the predictions are not as precise.
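The two rules stated for the area problem can be written as a small Python function that returns a qualitative prediction ("+", "-" or "0") rather than a number. This is a sketch, not the learned model: the function name `predict_area_change` and the parameter `full_area` (the total number of pixels in the image) are assumptions, and the rules elided in the slides are deliberately left unhandled.

```python
def predict_area_change(action, orientation, area, full_area):
    """Return '+', '-', '0', or None when no stated rule applies."""
    if action == "forward":
        # The area increases until the blob occupies the whole image.
        return "0" if area >= full_area else "+"
    if action == "left" and orientation < 0:
        # Turning further away: the area decreases until it reaches zero.
        return "0" if area == 0 else "-"
    return None  # the remaining rules are elided in the slides
```

The output says only in which direction the area changes, which is why such a model can be right more often than a numerical one while saying less about the exact value.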
methods
• active learning + planning
• learning methods:
– Padé: Žabkar, Možina, Bratko, Demšar. Learning Qualitative Models from Numerical Data. AIJ, 2011
– STRUDEL: Košmerlj, Bratko, Žabkar. Embodied Concept Discovery through Qualitative Action Models. IJUFKS, 2011
– Qube: Žabkar et al. Preference Learning from Qualitative Partial Derivatives. ECML Preference Learning Workshop, 2010
– Hyper (with predicate invention mechanism): Leban, Žabkar, Bratko. An Experiment in Robot Discovery with ILP. Proc. ILP 2008
• tested on simulated (billiards) and real data (medical application, robotics)
ceteris paribus
• e.g. partial differentiation
• observe a qualitative relation between two selected features, other features held constant
• qualitative relations of 3 types:
– x increases ⇒ f(x) increases (Padé)
– preference relation: an ordering of x and y implies an ordering of f(x) and f(y)
– structural: on(A,B,t1), on(A,C,t2)
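The ceteris paribus idea for the first type of relation can be sketched in Python as a qualitative partial derivative: vary one feature, hold the others constant, and record only the sign of the change. Note the assumptions: this uses a plain finite difference on a known function, whereas Padé estimates such signs from sampled numerical data; the function name `qualitative_partial` and its `step` and `tol` parameters are hypothetical.

```python
def qualitative_partial(f, point, index, step=1e-3, tol=1e-9):
    """Sign of the change in f when only point[index] is increased by step."""
    shifted = list(point)
    shifted[index] += step           # vary the selected feature only
    change = f(*shifted) - f(*point)
    if change > tol:
        return "+"
    if change < -tol:
        return "-"
    return "0"


def example(x, y):
    """Hypothetical target function: increasing in x for x > 0, decreasing in y."""
    return x * x - 3 * y


signs_at_2_1 = [qualitative_partial(example, (2.0, 1.0), i) for i in range(2)]
sign_at_minus2 = qualitative_partial(example, (-2.0, 1.0), 0)
```

The sign with respect to x differs between the two points, which shows why a qualitative model has to state in which region of the feature space each relation holds.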
"all other things being equal"
qualitative models
[diagram: data → (Padé, Qube, STRUDEL) → qualitative changes → (machine learning, statistics) → qualitative models]
learning with structured data
• ILP with predicate invention is too complex for real-time learning
• we use ILP to learn smaller subtasks: structural qualitative changes
www.ailab.si/xpero
the concept "movable"
the discovered condition which distinguishes different effects of actions:
p1(Obj) :-
    at(T1, Obj, Pos1),
    at(T2, Obj, Pos2),
    neq_pos(Pos1, Pos2).
p1 is true if the object was observed at two different positions
the discovered effects of actions:
move(T, Obj) :- p1(Obj), f1(T, Obj).
move(T, Obj) :- not p1(Obj), f2(T, Obj).
f1(T1, Obj) :-
    at(T1, Obj, Pos1),
    at(T2, Obj, Pos2),
    Pos1 \== Pos2,
    {T2 = T1+1}.
f2(T, Obj) :- not f1(T, Obj).
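For readers less familiar with Prolog, the meaning of p1 and f1 can be paraphrased in Python. This is a loose paraphrase for illustration, not a translation of the ILP system's output: it assumes observations are given as `(time, object, position)` triples with at most one position per object and time step, and it reuses the names `p1` and `f1` only for readability.

```python
def p1(observations, obj):
    """True if the object was observed at two different positions."""
    positions = {pos for _, o, pos in observations if o == obj}
    return len(positions) >= 2


def f1(observations, t, obj):
    """True if the object's position differs between time t and time t + 1."""
    at = {(time, o): pos for time, o, pos in observations}
    if (t, obj) not in at or (t + 1, obj) not in at:
        return False
    return at[(t, obj)] != at[(t + 1, obj)]


# Hypothetical observations: the ball changes position, the wall never does.
observations = [
    (0, "ball", (1, 1)), (1, "ball", (2, 1)),
    (0, "wall", (5, 5)), (1, "wall", (5, 5)),
]
movable = [obj for obj in ("ball", "wall") if p1(observations, obj)]
```

Here `movable` contains only the objects for which p1 holds, which is the sense in which the invented predicate captures the concept.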